Preface

Jeffrey D. Scargle
Space Science and Astrobiology Division, NASA Ames Research Center

SigSpec is a method for detecting and characterizing periodic signals in 
noisy data. This is an extremely common problem, not only in astronomy 
but in almost every branch of science and engineering. This work will be of 
great interest to anyone carrying out harmonic analysis employing Fourier 
techniques. 

The method is based on the definition of a quantity called spectral signif- 
icance - a function of Fourier phase and amplitude. Most data analysts are 
used to exploring only the Fourier amplitude, through the power spectrum, 
ignoring phase information. The Fourier phase spectrum can be estimated 
from data, but its interpretation is usually problematic. The spectral sig- 
nificance quantity conveys more information than does the conventional 
amplitude spectrum alone, and appears to simplify statistical issues as well 
as the interpretation of phase information. 
Abstract

SigSpec computes the spectral significance levels for the DFT amplitude spec- 
trum of a time series at arbitrarily given sampling. It is based on the analyti- 
cal solution for the Probability Density Function (PDF) of an amplitude level, 
including dependencies on frequency and phase and referring to white noise. 
Using a time series dataset as input, an iterative procedure including step- 
by-step prewhitening of the most significant signal components and MultiSine 
least-squares fitting is provided to determine a whole set of signal components, 
which makes the program a powerful tool for multi-frequency analysis. Instead 
of the step-by-step prewhitening of the most significant peaks, the program 
is also able to take into account several steps of the prewhitening sequence 
simultaneously and check for the combination associated to a minimum resid- 
ual scatter. This option is designed to overcome the aliasing problem caused 
by periodic time gaps in the dataset. SigSpec can detect non-sinusoidal pe- 
riodicities in a dataset by simultaneously taking into account a fundamental 
frequency plus a set of harmonics. Time-resolved spectral significance analy- 
sis using a set of intervals of the time series is supported to investigate the 
development of eigenfrequencies over the observation time. Furthermore, an 
extension is available to perform the SigSpec analysis for multiple time series 
input files at once. In this MultiFile mode, time series may be tagged as target 
and comparison data. Based on this selection, SigSpec is capable of deter- 
mining differential significance spectra for the target datasets with respect to 
coincidences in the comparison spectra. A built-in simulator to generate and 
superpose a variety of sinusoids and trends as well as different types of noise 
completes the software package at the present stage of development. 
1. What is SigSpec?

SigSpec (abbreviation of 'SIGnificance SPECtrum') is a program that
computes a significance spectrum for a time series. It evaluates the
Probability Density Function (PDF) of a given DFT amplitude level analytically,
making use of the theoretical concept introduced by Reegen (2005, 2007).
The False-Alarm Probability, $\Phi_{\mathrm{FA}}(A)$, is the probability that an amplitude
in the DFT spectrum exceeds a given limit $A$, and is obtained through
integration of the PDF (e.g. Scargle 1982). Instead of this frequently used
quantity, SigSpec calculates the spectral significance (abbreviated 'sig')
of an amplitude $A$ by

$\mathrm{sig}(A) := -\log\left[\Phi_{\mathrm{FA}}(A)\right]$ . (1)

For example, a sig equal to 5 indicates that the considered amplitude level is due
to noise in one out of $10^5$ cases. This value is used as the default threshold
for the termination of the prewhitening sequence.
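As a minimal sketch of Eq. (1), the conversion between sig and False-Alarm Probability may be written in Python as follows. The sketch assumes a base-10 logarithm, which is what the reading "one out of 10^5 cases" for a sig of 5 implies. The function names are illustrative only and are not part of SigSpec.

```python
import math


def sig_from_false_alarm(phi_fa: float) -> float:
    """Spectral significance for a given False-Alarm Probability (Eq. 1).

    Assumes a base-10 logarithm, as implied by the sig = 5 example.
    """
    if not 0.0 < phi_fa <= 1.0:
        raise ValueError("False-Alarm Probability must lie in (0, 1]")
    return -math.log10(phi_fa)


def false_alarm_from_sig(sig: float) -> float:
    """Inverse relation: False-Alarm Probability for a given sig."""
    return 10.0 ** (-sig)


# A sig of 5 corresponds to one noise-generated peak in 10^5 cases.
default_threshold_probability = false_alarm_from_sig(5.0)
```

A larger sig thus always means a smaller probability that the peak is produced by noise.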

SigSpec performs an iterative process consisting of four steps^1:

1. computation of the significance spectrum, 

2. exact determination of the peak with maximum sig, 

3. a MultiSine least-squares fit of the frequencies, amplitudes and phases 
of all significant signal components detected so far, 

4. prewhitening of the sinusoidal components. The residuals are used as 
input for the next iteration. 
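The four steps above can be illustrated by a strongly simplified prewhitening loop in Python. This sketch is not the SigSpec algorithm: it ranks peaks by plain DFT amplitude instead of sig, searches only a given frequency grid, and fits only the newest sinusoid at fixed frequency instead of performing a MultiSine fit of all components. All function names are hypothetical.

```python
import math


def dft_amplitude(t, x, f):
    """DFT amplitude of the series (t, x) at frequency f."""
    re = sum(xi * math.cos(2.0 * math.pi * f * ti) for ti, xi in zip(t, x))
    im = sum(xi * math.sin(2.0 * math.pi * f * ti) for ti, xi in zip(t, x))
    return 2.0 * math.hypot(re, im) / len(t)


def fit_sinusoid(t, x, f):
    """Least-squares fit of a*cos + b*sin at fixed frequency f (f > 0)."""
    cc = cs = ss = xc = xs = 0.0
    for ti, xi in zip(t, x):
        c = math.cos(2.0 * math.pi * f * ti)
        s = math.sin(2.0 * math.pi * f * ti)
        cc += c * c
        cs += c * s
        ss += s * s
        xc += xi * c
        xs += xi * s
    det = cc * ss - cs * cs  # determinant of the 2x2 normal equations
    return (xc * ss - xs * cs) / det, (xs * cc - xc * cs) / det


def prewhiten(t, x, freqs, max_iter=10, min_amplitude=0.0):
    """Return the detected (frequency, amplitude) pairs and the residuals."""
    residuals = list(x)
    components = []
    for _ in range(max_iter):
        # Step 1: spectrum of the current residuals (amplitude, not sig).
        spectrum = [dft_amplitude(t, residuals, f) for f in freqs]
        # Step 2: highest peak on the grid.
        k = max(range(len(freqs)), key=spectrum.__getitem__)
        if spectrum[k] <= min_amplitude:
            break
        # Step 3 (simplified): fit only the new component.
        f = freqs[k]
        a, b = fit_sinusoid(t, residuals, f)
        components.append((f, math.hypot(a, b)))
        # Step 4: subtract the fitted sinusoid; residuals feed the next pass.
        residuals = [r - a * math.cos(2.0 * math.pi * f * ti)
                     - b * math.sin(2.0 * math.pi * f * ti)
                     for ti, r in zip(t, residuals)]
    return components, residuals


# Synthetic demonstration: two sinusoids on an equidistant grid.
t = [0.1 * i for i in range(200)]
x = [math.sin(2.0 * math.pi * 1.0 * ti) + 0.5 * math.cos(2.0 * math.pi * 2.5 * ti)
     for ti in t]
freqs = [0.05 * k for k in range(1, 81)]
components, residuals = prewhiten(t, x, freqs, max_iter=2)
```

In this demonstration the stronger sinusoid is removed first, and the residuals after two passes are close to zero.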

If SigSpec is called without any special settings, it produces four files: 

1. the DFT amplitude spectrum s000000.dat of the original time series,
containing also sig and phase,

2. the DFT amplitude spectrum resspec.dat of the residual time series
after prewhitening all significant signal components, containing also
sig and phase,

3. the residual time series residuals.dat after prewhitening all signifi- 
cant signal components, 

4. a result file called result.dat, which contains a list of significant 
signal components. 



5. MultiSine track files, each of which contains a list of the frequencies,
amplitudes and phases for a single sinusoidal component through the
prewhitening cascade (pp. 38, 89).

^1 The AntiAlC computation (p. 49) differs slightly from this procedure.

Further options may be applied to obtain spectra, residuals, and/or result
files (p. 96) in the prewhitening sequence. The MultiSine fits, which are
performed after each prewhitening step, modify the frequencies, amplitudes
and phases of previous components. If the user examines the resulting signal
components and decides not to use all of them, the additional result files
provide accurate frequencies, amplitudes and phases for a shorter list of
significant sinusoids without re-running the program.

SigSpec can produce additional files containing

1. a spectral window for the given time series (pp. 31, 98), 

2. a sampling profile (pp. 31, 91) containing the parameters $\alpha_0(\omega)$, $\beta_0(\omega)$,
$\theta_0(\omega)$ determining the dependency of the sig on the time-domain
sampling, as well as on frequency and phase in Fourier space (see Reegen
2007),

3. a preview of the SigSpec analysis (pp. 41, 91),

4. a Sock Diagram (pp. 32, 96), 

5. a Phase Distribution Diagram (pp. 36, 91) containing probability den- 
sities for the Fourier phases, 

6. a correlogram for each step of the prewhitening sequence (pp. 43, 87). 

These options are deactivated by default. 

Given a sequence of prewhitenings yielding $N$ significant components
with associated sigs $\mathrm{sig}(A_n)$, it is desirable to additionally know the
probability of the entire sequence to be valid. This means that not a single
erroneous component is allowed. The False-Alarm Probability
$\Phi_{\mathrm{FA}\,n} = 10^{-\mathrm{sig}(A_n)}$ of an individual peak is the probability that it is generated by
noise. The complementary probability that the considered peak is true is
$1 - 10^{-\mathrm{sig}(A_n)}$. If the individual components are statistically independent,
the cumulative probability of all components to be real is the product of
the individual probabilities,

$1 - \Phi_{\mathrm{FA}} = \prod_{n=1}^{N} \left(1 - \Phi_{\mathrm{FA}\,n}\right)$ . (2)
Consistently, the cumulative sig is introduced as the negative logarithm of
this total False-Alarm Probability for all identified signal components, $\Phi_{\mathrm{FA}}$,
and in terms of individual sigs, one obtains

$\mathrm{csig} := -\log \Phi_{\mathrm{FA}} = -\log\left\{1 - \prod_{n=1}^{N} \left[1 - 10^{-\mathrm{sig}(A_n)}\right]\right\}$ . (3)

Consistent with the definition of the sig associated with an amplitude
in the DFT spectrum, a cumulative sig of 3 means that the prewhitening
cascade is entirely true in 999 out of 1 000 cases. In other words, in
one out of 1 000 cases, at least one of the identified components is generated
by noise.

Whereas the individual sig of a component in the prewhitening sequence
may exceed that of the previously identified maximum, the cumulative sig
decreases monotonically with each additional signal component.
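The cumulative sig can be checked numerically with a few lines of Python. The sketch below assumes base-10 logarithms and statistically independent components, as described above, and uses the sig values printed on screen for the sample project SigSpecNative. The function name is illustrative and not part of SigSpec.

```python
import math


def cumulative_sig(sigs):
    """Cumulative sig of a non-empty list of individual sigs (all > 0).

    log1p/expm1 keep the result accurate when the individual
    False-Alarm Probabilities are very small.
    """
    # Logarithm of the probability that every component is real.
    log_all_true = sum(math.log1p(-10.0 ** (-s)) for s in sigs)
    # Total False-Alarm Probability: at least one component is noise.
    phi_fa_total = -math.expm1(log_all_true)
    return -math.log10(phi_fa_total)


# Individual sigs from the SigSpecNative screen output.
sigs = [9.54539, 7.43085, 5.30164, 4.13698]
csigs = [cumulative_sig(sigs[:n]) for n in range(1, len(sigs) + 1)]
```

The resulting values agree with the csig column of that screen output to the printed precision, and they decrease with every added component.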

The prewhitening loop stops if no sig level above a pre-defined limit is
found. As described in "Program termination", p. 24, there are three
different criteria that may be applied to determine the conditions for program
termination:

1. the number of iterations in the prewhitening sequence, 

2. a lower sig limit for the highest peak in the significance spectrum, 

3. a threshold for the cumulative sig related to a combined probability 
for all detected frequency components. 
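A minimal sketch of how the three criteria might be combined is given below. It assumes that the loop ends as soon as any active criterion is met, which this manual does not state explicitly. The function and parameter names are hypothetical; only the default sig limit of 5 is taken from the text.

```python
def should_stop(iteration, peak_sig, csig,
                max_iterations=None, sig_limit=5.0, csig_limit=None):
    """Return True if the prewhitening loop should end.

    iteration : index of the current prewhitening step (1-based)
    peak_sig  : sig of the highest peak in the current spectrum
    csig      : cumulative sig including the current peak
    A limit set to None is treated as inactive (an assumption).
    """
    if max_iterations is not None and iteration >= max_iterations:
        return True
    if sig_limit is not None and peak_sig < sig_limit:
        return True
    if csig_limit is not None and csig < csig_limit:
        return True
    return False
```

With the defaults, the fourth line of the SigSpecNative screen output (sig 4.13698) ends the loop, whereas the third line (sig 5.30164) does not.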

The program also supports the subdivision of a time series into a set 
of intervals and the separate analysis of all these parts in order to monitor 
frequency changes of signal components with time. This method will be 
called time-resolved analysis. In this case, the output is somewhat richer,
as described in "Time-resolved Analysis" (p. 44).

An inherent problem in the analysis of non-equidistantly sampled time
series is aliasing. Due to periodic gaps in the data set, a peak in the
amplitude spectrum is accompanied by side peaks. Especially if more than
one sinusoidal component is present in the data, the superposition of side
peaks may produce a maximum amplitude in the DFT spectrum at a
frequency that has nothing in common with the true signal frequencies. Such a
misidentification usually damages the complete prewhitening sequence from
this point on. As pointed out by Reegen (2007), SigSpec appears less prone
to aliasing than the previously used methods, since the noise component is
incorporated into the statistical treatment correctly. However, the
superposition mentioned above may still lead to erroneous identifications.

In order to overcome this potential weakness, SigSpec supports the
simultaneous calculation of more than one signal component.
Instead of picking only the peak associated with maximum sig, a whole set of
the highest peaks is examined, searching all possible combinations over several
iterations in order to obtain the solution providing a minimum rms residual.
This function is called AntiAlC (ANTI-ALiasing Correction) mode (p. 49).

There is a second option to examine multiple peaks simultaneously: a
non-sinusoidal periodicity is represented by multiple peaks in the DFT
amplitude spectrum. One finds a fundamental frequency, plus one or more
harmonics, the frequencies of which are integer multiples of the
fundamental. In astronomical applications, this may occur if shock waves are present
in the stellar pulsation or if surface variations are examined. In such a case,
it is desirable to take into account not only the fundamental frequency, but
also all available harmonics at once. This analysis of harmonics is described
on p. 53.

SigSpec is capable of analysing multiple time series input files
simultaneously. This MultiFile mode (p. 57) speeds up the computation
considerably for time series with the same sampling.

A further option is the evaluation of differential significance spectra 
(p. 60). The user may specify target vs. comparison data among the in- 
put files. Then SigSpec performs a quantitative comparison of the two 
groups of time series and returns a measure of the probability that a peak 
in a target dataset is 'true', taking into account amplitudes and phases at
the corresponding frequency in the comparison spectra. In this context, 
the term 'true' is used in the sense of 'not entirely produced by the same 
variability as present in the comparison data'. 

The examples presented here refer to the sample projects available for 
download at http://www.SigSpec.org.

2. How to Run SigSpec 

2.1. Projects 

SigSpec is called by the command line 

SigSpec <project> 

where <project> is the name (or path, if desired) of the SigSpec project. 
Before running the program, the user has to provide 

1. a directory <project> used for the output. 



SigSpec User's Manual 



2. a time series input file (see "The time series input file", p. 12). 

The project directory and the time series input file have to be located in 
the same folder. The project directory need not be empty. 

Caution: SigSpec overwrites existing output files! 

There are two conventions for naming input files.

1. The standard method is to pass only one time series input file to the
program. SigSpec expects the file to be named <project>.dat.

2. For an all-in-one analysis of multiple time series input files, i.e., for
running SigSpec in MultiFile mode, a leading six-digit index is
expected. In this case, the first file shall be named 000000.<project>.dat,
the next file is 000001.<project>.dat, and so on. For more
information on the MultiFile mode, please refer to "MultiFile mode", p. 57.

Furthermore, the user may pass a set of specifications to SigSpec by
means of a file <project>.ini (see "The .ini file", p. 13). This file is
expected in the same folder as the time series input file and the project
directory. For specifications not given by the user, defaults are used.
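The naming conventions above can be summarised in a small Python helper that lists the files SigSpec would look for. This is only an illustration of the conventions described in this section; the function name and the returned dictionary layout are assumptions, and POSIX-style paths are used for simplicity.

```python
from pathlib import PurePosixPath


def expected_inputs(project, n_files=None):
    """File names expected for a project, following the conventions above.

    project : project name or path, as given on the command line
    n_files : None for a single input file, otherwise the number of
              indexed input files used in MultiFile mode
    """
    base = PurePosixPath(project)
    folder, name = base.parent, base.name
    if n_files is None:
        data = [folder / f"{name}.dat"]
    else:
        # Leading six-digit index, starting at 000000.
        data = [folder / f"{i:06d}.{name}.dat" for i in range(n_files)]
    return {
        "output_dir": base,              # must exist before the run
        "ini": folder / f"{name}.ini",   # optional
        "data": data,
    }


single = expected_inputs("SigSpecNative")
multi = expected_inputs("runs/demo", n_files=2)
```

Here "runs/demo" is a made-up project path; the input files and the .ini file sit next to the project directory, not inside it.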

Example. The sample project SigSpecNative provides a run without any 
additional options. The command line is SigSpec SigSpecNative. The 
sample input file SigSpecNative.dat (381 data points) represents V mag- 
nitudes of IC 4996 #89 (Zwintz et al. 2004; Zwintz & Weiss 2006). 

The screen output produced by typing SigSpec SigSpecNative at run- 
time starts with a standard header. 

SSSSSS il SSSSSS 

ss ss ss ss 

SS ii gggg g SS p pppp eeeee ccccc 

SS ii gg gg SS pp pp ee ee cc cc 

SSSSSS ii gg gg SSSSSS pp pp ee ee cc 

SS ii gg gg SS pp pp eeeeeee cc 

SS ii gg gg SS pp pp ee cc 

SS SS ii gg gg SS SS pp pp ee ee cc cc 

SSSSSS ii gggggg SSSSSS pppppp eeeee ccccc 

gg PP 

gg gg PP 

ggggg PP 



SIGnificance SPECtrum
Version 2.

by Piet Reegen
Institute of Astronomy
University of Vienna
Tuerkenschanzstrasse 17
1180 Vienna, Austria
Release date: August 18, 2009



SigSpec processes the command line, checks whether a project directory
SigSpecNative is present, and searches for a file SigSpecNative.ini (see
"The .ini file", p. 13). Since no such file is present, four warning
messages are produced.

*** start **************************************************
command line interface
Checking availability of project directory SigSpecNative...
project directory SigSpecNative ok.
loading .ini file

Warning: IniFile_SSCols 001
Failed to open .ini file.

Warning: IniFile_WCols 001
Failed to open .ini file.

Warning: IniFile_LoadIni 001
Failed to open .ini file.

Warning: IniFile_Cind 001
Failed to open .ini file.

The next task is to load the input file SigSpecNative.dat. SigSpec
provides the number of rows, the time interval width, and the standard
deviation of the observable.

*** loading time series input file(s) **********************
SigSpecNative.dat
*** time series properties *********************************
points 381, time base 9.17532, rms dev 0.00449592

The next section contains the specifications for the DFT and significance 
spectra to be calculated. 

*** preparing to run SigSpec *******************************
Rayleigh frequency resolution 0.1089880382935977
oversampling ratio 20.0000000000000000
frequency spacing 0.0054494019146799
lower frequency limit 0.0000000000000000
upper frequency limit 100.4651736990383739
Nyquist coefficient 0.5000000000000000
number of frequencies 18437
As SigSpec performs the prewhitening sequence, a list of detected
signal components is displayed. The screen output contains the index of the
identified component (a line number), the frequency, the sig, the time-domain
rms deviation before prewhitening the corresponding signal, and the
cumulative sig (csig). The last line contains an insignificant component that
meets the breakup condition. In the present example, the default breakup
condition (the sig dropping below 5) is applied, which is satisfied in the
fourth iteration, where the maximum sig is 4.13698 (csig 4.10802).

*** running SigSpec ****************************************
1 freq 3.13205 sig 9.54539 rms 0.00449592 csig 9.54539
2 freq 3.98471 sig 7.43085 rms 0.00422861 csig 7.42753
3 freq 5.40684 sig 5.30164 rms 0.0040257 csig 5.2984
4 freq 17.3677 sig 4.13698 rms 0.00388775 csig 4.10802

On exit, SigSpec displays a good-bye message. 

Finished.
************************************************************
Thank you for using SigSpec!
Questions or comments?
Please contact Piet Reegen (reegen@astro.univie.ac.at)
Bye!

If no special output is selected in a file SigSpecNative.ini, SigSpec
produces the following output files in the project directory SigSpecNative:

• s000000.dat: DFT and significance spectrum of the original time
series (without any prewhitening),

• result.dat: list of significant signal components detected in the time
series,

• residuals.dat: residual time series after prewhitening all significant
signal components listed in result.dat,

• resspec.dat: DFT and significance spectrum of the residual time
series residuals.dat.

Fig. 1 contains the sample input SigSpecNative.dat, the MultiSine fit
to the time series according to the list of significant signal components
in SigSpecNative/result.dat, and the residuals after subtracting the fit
from the input time series (file SigSpecNative/residuals.dat). Fig. 2
refers to the frequency domain: the DFT spectrum of the initial time series
SigSpecNative/s000000.dat, the three significant signal components
contained in SigSpecNative/result.dat, and the residual spectrum in the file
SigSpecNative/resspec.dat. For detailed information on the contents of
the output files, please refer to "Default Output", p. 25.

Figure 1: Black circles: light curve for the sample project SigSpecNative.
Line: fit formed by three significant signal components (as listed in the file
SigSpecNative/result.dat). Grey dots: residuals after prewhitening of three
significant signal components (file SigSpecNative/residuals.dat).

Furthermore, the user may pass a set of specifications to SigSpec in a
file <project>.ini (see "The .ini file", p. 13). For specifications not given
by the user, defaults are used.

2.2. Quiet mode 

If the command line is followed by the letter 'q', i.e.

SigSpec <project> q

all screen output is suppressed.
Figure 2: Grey: Fourier spectra for the sample project SigSpecNative. Left:
significance spectra. Right: DFT amplitudes. Top: original spectra, without prewhitening
(file SigSpecNative/s000000.dat). Bottom: residual spectra, with three
significant signal components prewhitened (file SigSpecNative/resspec.dat). In the top
panels, the significant components are indicated by dots with dashed drop lines (file
SigSpecNative/result.dat). The default sig threshold of 5 is represented by a
horizontal dashed line in the left panels. Horizontal axes: frequency f [d^-1].



3. Input 

3.1. The time series input file

The input file for SigSpec is a time series. The corresponding file has to 
be located in the same folder as the project directory. The only restrictions 
to the format are that the number of items per row has to be constant for 
all rows in the file and that columns have to be separated by at least one 
whitespace character or tab. Dataset entries need not be numeric, except 
for the columns specified as time, observable, and weights (p. 13). 
3.2. The .ini file

An optional file <project>.ini consists of a set of keywords and arguments
defining project-specific parameters for SigSpec. If this file is not present in
the same folder as the time series input file(s), SigSpec uses a set of default
parameters. A complete list of keywords is given in "Keywords Reference",
p. 86.

Multiple use of the same keyword or the specification of contradictory
keywords causes the software to take into account only the last declaration.
There are only three exceptions:

1. SigSpec accepts multiple weights columns specified by col:weights
(p. 14),

2. multiple subset identifier columns may be specified by col:ssid (p. 16),

3. the simulator may be used to synthesize a multiperiodic signal plus
various types of noise upon the given sampling, where the keywords
sim:signal, sim:poly, sim:exp, sim:zeromean, sim:serial, sim:temporal,
and sim:rndsteps may be used multiple times (see "The simulator mode",
p. 64).

Caution: SigSpec expects a carriage-return character at the
end of the file <project>.ini, otherwise the program may hang!

Lines in the .ini file starting with a # character are ignored by SigSpec.
This makes it possible to write comments into the file. Furthermore,
additional characters beyond what is expected in a line (keyword
plus required number of parameters) are ignored. Thus comments may
also be added at the end of lines containing relevant information for
SigSpec.
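The rules of this section can be illustrated by a small Python parser. It is a sketch of the described behaviour, not SigSpec's own reader. The table of argument counts covers only keywords whose usage is shown in this manual, the sim: keywords are left out because their argument counts are not given here, and skipping unknown keywords is an assumption.

```python
# Number of arguments per keyword, as far as shown in this manual.
ARITY = {
    "col:time": 1, "col:obs": 1, "col:weights": 2, "col:ssid": 1,
    "lfreq": 1, "ufreq": 1, "nycoef": 1, "nyscan": 0,
    "freqspacing": 1, "osratio": 1,
}
# Keywords that may be given more than once.
REPEATABLE = {"col:weights", "col:ssid"}


def parse_ini(text):
    """Parse .ini text into a dictionary of keyword -> argument list(s)."""
    settings = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank line or comment line
        tokens = line.split()
        keyword = tokens[0]
        if keyword not in ARITY:
            continue  # unknown keyword: skipped in this sketch
        # Anything beyond the required arguments is a trailing comment.
        args = tokens[1:1 + ARITY[keyword]]
        if len(args) < ARITY[keyword]:
            raise ValueError(f"too few arguments for {keyword}")
        if keyword in REPEATABLE:
            settings.setdefault(keyword, []).append(args)
        else:
            settings[keyword] = args  # the last declaration wins
    return settings


sample = """# sample settings
col:time 2
col:obs 3   observable column
col:weights 3 2
col:weights 4 1
osratio 12
osratio 20
nyscan
"""
settings = parse_ini(sample)
```

In the sample, the second osratio line overrides the first, both col:weights lines are kept, and the text after "col:obs 3" is ignored.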

3.3. Time series columns representing time and observable 

The keywords col:time and col:obs determine those columns in the time
series input file which contain time values and the observable monitored
over time, respectively. These columns are required and have to be uniquely
specified. Column indices start with 1.

If col:time is not specified, the default value 1 is used. If col:obs is
not specified, the default value 2 is used.

Example. The sample project coltimecolobs contains a dataset where the
time and observable values are found in columns 2 and 3, respectively. The
input time series represents the V photometry of IC 4996 #89 (see Example
SigSpecNative, p. 8). The file coltimecolobs.ini contains the two lines

col:time 2
col:obs 3

3.4. Time series columns containing statistical weights 

Furthermore, one or more columns with statistical weights may be specified
using the keyword col:weights. The keyword accepts two arguments: the
first is the column index, the second is a floating-point value, say $p_n$ for
the $n$th weights column. Given $N$ weights columns indexed according to
$n = 1, \ldots, N$, the total weight $w_m$ for the $m$th row is evaluated using the weights
$w_{nm}$ in the individual columns according to

$w_m := \prod_{n=1}^{N} w_{nm}^{\,p_n}$ . (4)

Weights need not be normalised; this is performed by SigSpec.
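As an illustration, the total weight per row can be evaluated in Python by raising each weights column to the exponent given with col:weights and multiplying the results. The normalisation to unit sum used below is an assumption, since this section does not say how SigSpec normalises the weights. The function name is hypothetical.

```python
def total_weights(rows, columns):
    """Total weight per row from one or more weights columns.

    rows    : list of rows, each a list of floats
    columns : list of (column index starting at 1, exponent) pairs,
              one pair per col:weights entry
    Returns the weights normalised to unit sum (assumed convention).
    """
    raw = []
    for row in rows:
        weight = 1.0
        for col, exponent in columns:
            weight *= row[col - 1] ** exponent
        raw.append(weight)
    total = sum(raw)
    return [weight / total for weight in raw]


# Three rows (time, observable, weight) with weights of 0 or 1.
rows = [[0.0, 1.0, 1.0], [0.1, 1.1, 0.0], [0.2, 0.9, 1.0]]
squared = total_weights(rows, [(3, 2)])
linear = total_weights(rows, [(3, 1)])
```

For weights that are either 0 or 1, the exponent has no effect, so both calls return the same list.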

Time, observable, and weights columns have to consist of floating-point 
numbers only. SigSpec checks these columns before starting the compu- 
tations. If a non-numeric entry is found in one of the special columns, the 
program terminates with an error message. 

Caution: SigSpec does not support the exponential notation
(e.g. 4.234E03 or 1.0385e-03)!
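Before running SigSpec, an input file can be checked against the format rules stated so far (constant number of items per row, whitespace-separated columns, plain decimal numbers in the time, observable and weights columns). The following Python sketch performs such a check. It is not part of SigSpec, the function name is hypothetical, and skipping blank lines is an assumption.

```python
import re

# Plain decimal number without exponent, e.g. 12, -0.5, +3., .25
_PLAIN_NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)")


def read_time_series(text, col_time=1, col_obs=2, col_weights=()):
    """Return (time, observable, weights...) tuples from input text.

    Column indices start with 1, as in the .ini file.
    Raises ValueError if a format rule is violated.
    """
    special = [col_time, col_obs, *col_weights]
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows:
        raise ValueError("no data rows found")
    width = len(rows[0])
    data = []
    for number, row in enumerate(rows, start=1):
        if len(row) != width:
            raise ValueError(
                f"row {number}: expected {width} items, found {len(row)}")
        values = []
        for col in special:
            item = row[col - 1]
            if not _PLAIN_NUMBER.fullmatch(item):
                raise ValueError(
                    f"row {number}, column {col}: '{item}' is not a plain number")
            values.append(float(item))
        data.append(tuple(values))
    return data


# Column 1 holds a non-numeric label, which is allowed.
series = read_time_series("a 1.0 2.5\nb 2.0 -0.5\n", col_time=2, col_obs=3)
```

Entries in exponential notation and rows of unequal length are rejected with an error.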

Example. The sample project weights contains a dataset with statistical
weights in column 3, the squares of which are used by SigSpec, as specified
by the .ini file entry

col:weights 3 2

The input time series weights.dat represents the sampling of IC 4996 #89
(V), and the magnitudes were synthesized by

1. a sinusoid with frequency 4.68573 cycles per day, amplitude 17.27
mmag,

2. Gaussian noise with 25 mmag rms deviation only for the measure- 
ments between HJD 2452524 and HJD 2452525, 

3. Gaussian noise with 2.5 mmag rms deviation for all other nights. 
Figure 3: Light curve for the sample project weights.



The resulting light curve is displayed in Fig. 3. Fig. 4 compares the
frequency domain output (a closeup for frequencies below 10 cycles per day)
with and without employing the weights. Without weights, the peak at 4.7
cycles per day is visible, but not the most significant one. Moreover, there is
no signal that reaches the sig threshold of 5.^2

1 freq 5.68136 sig 3.75547 rms 0.088716 csig 3.75547

Column 3 in the time series input file weights.dat contains zeroes for
the measurements between HJD 2452524 and HJD 2452525 and values of 1
for the rest. Consequently, in this example, the exponent 2 assigned to the
keyword col:weights in the file weights.ini does not affect the weighting:
the result would be the same if, e.g.,

col:weights 3 1

were given instead of

col:weights 3 2

^2 The result without weights is found in the project directory noweights.

Figure 4: Grey: Fourier spectra for the sample project weights. Left: significance
spectra. Right: DFT amplitudes. Top: spectra of the unweighted time series.
Bottom: spectra employing statistical weights. The significant components are indicated
by dots with dashed drop lines (file weights/result.dat). The default sig threshold
of 5 is represented by a horizontal dashed line in the left panels. Horizontal axes:
frequency f [d^-1].

Employing the weights column, SigSpec provides the component at 4.7
cycles per day as the only significant signal:

1 freq 4.67968 sig 20.395 rms 0.029129 csig 20.395
2 freq 30.5489 sig 4.47468 rms 0.0252866 csig 4.47468

3.5. Time series columns containing subset identifiers 

If the mean magnitude of a light curve is to be adjusted to zero for each
night, or if the data are obtained from more than one site, one may perform
an individual zero-mean correction for subsets of the total time series. This is
achieved by the keyword col:ssid in the .ini file. This keyword is followed by
the index of the column which shall be assigned to subset identifiers and may
be specified multiple times if more than one subset identifier column is given.
Subset identifiers may be arbitrary alpha-numeric strings.

If col:ssid is specified, SigSpec treats all lines in the dataset with equal 
subset identifiers as individual subsets, for each of which a zero-mean correc- 
tion is performed. Subsequently, SigSpec performs the appropriate statistical 
calculations, taking into account that the zero-mean correction for subsets di- 
minishes the degrees of freedom for noise. 

If more than one subset column is specified, data points are considered to 
belong to the same subset, if all corresponding subset identifiers are equal. 
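The zero-mean correction per subset can be sketched in Python as follows. Data points are grouped by the combination of all subset identifiers, and the mean of each group is subtracted. The function name is hypothetical, and the statistical treatment of the reduced degrees of freedom is not reproduced here.

```python
from collections import defaultdict


def zero_mean_by_subset(values, *id_columns):
    """Subtract the subset mean from every value.

    values     : observable values, one per row
    id_columns : one or more sequences of subset identifiers;
                 rows belong to the same subset only if all their
                 identifiers are equal
    Returns the corrected values and the number of subsets.
    """
    if not id_columns:
        raise ValueError("at least one subset identifier column is required")
    keys = list(zip(*id_columns))
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        groups[key].append(value)
    means = {key: sum(group) / len(group) for key, group in groups.items()}
    corrected = [value - means[key] for key, value in zip(keys, values)]
    return corrected, len(groups)


# Two nights "A" and "B" with different zeropoints.
corrected, n_subsets = zero_mean_by_subset(
    [1.0, 3.0, 10.0, 14.0], ["A", "A", "B", "B"])
```

After the correction, each night has zero mean, so a constant offset between nights no longer appears as low-frequency signal.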
Figure 5: Light curve for the sample project subsets. Solid line: Sinusoidal signal
used as input.

Example. The sample project subsets contains a dataset with subset
identifiers in column 3. The input time series subsets.dat represents the
sampling of IC 4996 #89 (V), and the magnitudes were synthesized by adding

1. Gaussian noise with 5 mmag rms deviation,

2. a sinusoid with frequency 6.43682 cycles per day and amplitude 2.62
mmag,

3. individual constant zeropoint offsets on a millimag range for each
night.



The resulting light curve is displayed in Fig. 5, displaying the input
signal as a solid line and the data points including the nightly offsets as open
dots. Fig. 6 compares the resulting frequency domain output (a closeup for
frequencies below 10 cycles per day) with and without employing the subset
identifiers. If no subdivision according to the subset identifiers is performed, the spectra
show a peak at 6.4 cycles per day plus several spurious peaks at frequencies
below 2 cycles per day, which are due to the interpretation of the nightly
shifts as signal in the 1-cycle-per-day domain and also harmonics.^3
Consequently, SigSpec identifies two additional significant signal components at
low frequencies:

1 freq 0.575256 sig 14.5784 rms 6.0813 csig 14.5784
2 freq 7.43176 sig 6.39232 rms 5.51956 csig 6.39232
3 freq 0.286066 sig 5.21585 rms 5.29531 csig 5.18785
4 freq 75.1664 sig 3.39587 rms 5.12278 csig 3.38892

^3 The result without subsets is found in the project directory nosubsets.



Column 3 in the time series input file subsets.dat contains characters
A to J uniquely assigned to each night. Employing the subsets column
eliminates the low-frequency artefacts, and SigSpec provides the component at
6.4 cycles per day as the only significant signal:

1 freq 6.4376 sig 8.20485 rms 5.2253 csig 8.20485
2 freq 75.1661 sig 3.70924 rms 4.95954 csig 3.70922




Figure 6: Grey: Fourier spectra for the sample project subsets. Left: significance 
spectra. Right: DFT amplitudes. Top: spectra of the total time series. Bottom: 
spectra of the subdivided time series. The significant components are indicated by 
dots with dashed drop lines (file subsets/result.dat). The default sig threshold of
5 is represented by a horizontal dashed line in the left panels. 



3.6. Lower frequency limit 

The frequency where the computation of spectra starts is specified by the
keyword lfreq. By default, the lower frequency limit is zero.

Example. The sample project limits illustrates the use of the keyword
lfreq. It uses the V photometry of IC 4996 #89 as input file limits.dat,
and the file limits.ini contains the line

lfreq 1

which forces SigSpec to perform all computations for frequencies > 1
cycle per day. The spectrum limits/s000000.dat is displayed in Fig. 7.




Figure 7: Grey: Fourier spectra for the sample project limits. Left: significance
spectrum. Right: DFT amplitudes. The significant components are indicated by dots
with dashed drop lines (file limits/result.dat). The default sig threshold of 5 is
represented by a horizontal dashed line in the left panel. The frequency range is set
from 1 to 5 cycles per day using the keywords lfreq and ufreq.



3.7. Upper frequency limit and Nyquist Coefficient 

The keyword ufreq sets the upper limit of the frequency interval
to be considered.

An alternative method is the automatic determination of this limit by means
of the Nyquist Coefficient (keyword nycoef). For equidistantly sampled time
series with sampling interval width $\delta t$, there is a uniquely defined Nyquist
Frequency

$f_{\mathrm{Ny}} := \frac{1}{2\,\delta t}$ . (5)


In case of non-equidistant sampling, each sampling interval between two
consecutive time values may be considered to produce its individual Nyquist
Frequency, whence this limit is ambiguous. In this case, the Nyquist Coefficient for
an arbitrarily given frequency is introduced as the fraction of sampling intervals
in the time domain the individual Nyquist Frequency of which is higher than
the frequency under consideration. This makes it possible to select an upper frequency
limit by specifying a minimum Nyquist Coefficient. E.g., specifying a Nyquist
Coefficient of 0.5 (which is the default value) guarantees that at least half of
the information contained by the spectrum in the considered frequency range
is below the Nyquist limit.
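The definition of the Nyquist Coefficient translates directly into a short Python function. The sketch assumes time values sorted in ascending order and takes the individual Nyquist Frequency of an interval of width dt to be 1/(2 dt). The second function returns the frequency below which the coefficient stays at or above the requested minimum; whether SigSpec determines its upper frequency limit in exactly this way is an assumption. Both function names are hypothetical.

```python
import math


def nyquist_coefficient(times, freq):
    """Fraction of sampling intervals whose individual Nyquist
    Frequency 1 / (2 * dt) is higher than freq."""
    intervals = [b - a for a, b in zip(times, times[1:]) if b > a]
    above = sum(1 for dt in intervals if 1.0 / (2.0 * dt) > freq)
    return above / len(intervals)


def upper_frequency_bound(times, nycoef=0.5):
    """Frequency below which the Nyquist Coefficient is >= nycoef."""
    intervals = [b - a for a, b in zip(times, times[1:]) if b > a]
    individual = sorted((1.0 / (2.0 * dt) for dt in intervals), reverse=True)
    # Number of intervals that must lie above the frequency.
    needed = max(1, math.ceil(nycoef * len(individual)))
    return individual[needed - 1]


# Irregular sampling with interval widths 0.1, 0.1, 0.2 and 0.4.
times = [0.0, 0.1, 0.2, 0.4, 0.8]
coefficient_at_2 = nyquist_coefficient(times, 2.0)
bound = upper_frequency_bound(times, nycoef=0.5)
```

For equidistant sampling all intervals share the same Nyquist Frequency, so the coefficient drops from 1 to 0 at that frequency.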

Additional information is available by setting the keyword nyscan in the
.ini file. If this keyword is specified, SigSpec creates a file nyscan.dat 
in the project directory containing the Nyquist Coefficients over the specified 
frequency range. 

Example. The sample project limits illustrates the use of the keyword 
ufreq. The line 

ufreq 5 

in the file limits.ini restricts all computations performed by SigSpec to
frequencies below 5 cycles per day. The spectrum limits/s000000.dat (sig
and amplitude) is displayed in Fig. 7. A comparison with the screen output
in Example SigSpecNative, p. 10, where no restrictions to the frequency
range apply, shows that the screen output in this example contains one line
less:

1 freq 3.13205 sig 9.54539 rms 0.00449592 csig 9.54539
2 freq 3.98471 sig 7.43085 rms 0.00422861 csig 7.42753
3 freq 2.664 sig 4.60182 rms 0.0040257 csig 4.60117

The signal component at 5.4 cycles per day is not detected, because it is 
outside the specified frequency range. 

Example. The sample project nyos illustrates the use of the keywords
nycoef and nyscan for the V photometry of IC 4996 #89. The line

nycoef 0.99

in the file nyos.ini provides an upper frequency limit of 110.77 cycles per
day. The keyword nyscan is given, and the file nyos/nyscan.dat contains
the Nyquist Coefficients for frequencies from 0 to 110.77 cycles per day, as
displayed in Fig. 8.

3.8. Frequency spacing and oversampling ratio 

The width of the interval between consecutive frequencies may be specified by 
the keyword freqspacing. 

Figure 8: The file nyos/nyscan.dat contains the Nyquist coefficients depending on
frequency for the sample project nyos.

An alternative method is the automatic determination of the spacing by
means of the oversampling ratio. In case of equidistantly sampled time series,
the frequency spacing is defined by

$$\delta f := \frac{1}{T}, \tag{6}$$

where $T$ denotes the width of the total time interval. For non-equidistant time
series, it is advisable to use a denser frequency sampling,

$$\delta f := \frac{1}{\text{(oversampling ratio)} \cdot T}, \tag{7}$$

in which the total time interval is multiplied by the oversampling ratio. This
quantity may be specified using the keyword osratio. The default value is 20,
which is, in most cases, sufficient for practical use.
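As a small worked illustration of the two spacing rules, the following Python sketch assumes that the spacing is the reciprocal of the total time base, divided by the oversampling ratio in the non-equidistant case. The function name and the flag `equidistant` are hypothetical and only serve this example.

```python
def frequency_spacing(times, osratio=20.0, equidistant=False):
    """Frequency spacing from the total time base T of the series.

    Assumed rules: 1 / T for equidistant sampling, and
    1 / (osratio * T) otherwise (default oversampling ratio 20).
    """
    T = max(times) - min(times)
    if T <= 0:
        raise ValueError("time series must span a positive interval")
    if equidistant:
        return 1.0 / T
    return 1.0 / (osratio * T)


# A 10-day time base with the default oversampling ratio of 20
# gives a spacing of 0.005 cycles per day.
print(frequency_spacing([0.0, 2.5, 10.0]))
```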



Example. The sample project limits illustrates the use of the keyword
freqspacing; an example for the keyword osratio is provided in the sample
project nyos. Both samples use the V photometry of IC 4996 #89 as input
time series. The line

freqspacing 0.001

in the file limits.ini forces SigSpec to calculate Fourier amplitudes and
sigs for every 0.001 cycles per day. In the file nyos.ini, there is a line

osratio 12

which overrides the default oversampling ratio of 20. Fig. 9 compares the
standard spacing from Example SigSpecNative, p. 8, with the spacings
obtained applying the two above modifications.

Figure 9: Close-up for the significance spectra generated by the projects
SigSpecNative, limits and nyos around the main peak for the V photometry of
IC 4996 #89. Different settings for frequency spacing and oversampling ratio are
applied.



3.9. Accuracy of MultiSine fits 

By default, SigSpec performs a MultiSine least-squares fit each time a new 
significant signal component is detected. The fitting procedure improves the fre- 
quencies, amplitudes, and phases of all previously detected signal components. 
The algorithm applies Newton's root finding scheme to the first derivatives of 
the residual variance. 

The precision of computed frequencies via MultiSine least-squares fits is
defined according to

$$\delta f := \frac{\mu}{T\,\mathrm{sig}^{\kappa/2}}, \tag{8}$$

where $\mu$ and $\kappa$ are the accuracy parameters for MultiSine fitting. The default
value of $\mu$ is $10^{-6}$, that of $\kappa$ is 1. They may be adjusted by the keyword
multisine:newton, followed by $\mu$, $\kappa$ and a third parameter determining the
relative tolerance of the time-domain rms error between consecutive iterations
(see next paragraph). To reduce the potential time consumption of the
procedure, $\mu$ can be adjusted to achieve an overall scaling of the frequency accuracy.
The value of $\kappa$ determines the dependence of the demanded frequency precision
on the sig of the peak under consideration. Setting $\kappa = 0$ yields the Rayleigh
frequency resolution, for $\kappa = 1$ one obtains the Kallinger resolution (Kallinger,
Reegen & Weiss 2008).
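To make the role of the two accuracy parameters concrete, the sketch below evaluates the demanded frequency precision under the assumption that it has the form $\mu / (T\,\mathrm{sig}^{\kappa/2})$, which reduces to a scaled Rayleigh resolution $1/T$ for $\kappa = 0$ and to a scaled Kallinger resolution $1/(T\sqrt{\mathrm{sig}})$ for $\kappa = 1$. The function and argument names are hypothetical.

```python
def frequency_precision(T, sig, mu=1e-6, kappa=1.0):
    """Demanded frequency precision, assuming mu / (T * sig**(kappa / 2)).

    T     : total time base of the series
    sig   : spectral significance of the peak under consideration
    mu    : overall scaling of the accuracy (default 1e-6)
    kappa : 0 -> Rayleigh-type limit, 1 -> Kallinger-type limit
    """
    return mu / (T * sig ** (kappa / 2.0))


# With mu = 1 the plain resolutions are recovered:
print(frequency_precision(10.0, 4.0, mu=1.0, kappa=0.0))  # Rayleigh-type, 1 / T
print(frequency_precision(10.0, 4.0, mu=1.0, kappa=1.0))  # Kallinger-type, 1 / (T * sqrt(sig))
```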

The criterion on which MultiSine fitting is based is the minimisation of the rms
residual. Thus the rms residual is required to decrease from one iteration
to the next; otherwise the fitting procedure is terminated. To speed up the
computation, the MultiSine fit can be terminated if the relative improvement of
the rms residual drops below a positive number. The default value is $10^{-6}$. This value
may be re-adjusted by the third parameter to the keyword multisine:newton.

The two termination conditions are linked by a logical 'and', i.e. the Mul- 
tiSine fitting procedure stops if both the desired frequency accuracy is reached 
for all signal components and the improvement of residual rms drops below the 
specified threshold. 

There is an optional keyword, multisine:lock, that forces the prewhitening
cascade to rely on the "raw" frequencies, amplitudes and phases (i.e. those
without MultiSine fitting). Resulting signal components are improved to
obtain a least-squares fit in each iteration, but this improvement is ignored in
the prewhitening sequence. The default setting is that the improved
parameters are used for the subsequent analysis (as also obtained by the keyword
multisine:unlock).



Example. The sample project multisine illustrates the application of the
keyword multisine:newton to the IC 4996 #89 photometry (V) as input
file multisine.dat. The file multisine.ini contains the line

multisine 0.001 0 0.01

which reduces the accuracy of the MultiSine fit, compared to the default
values 0.000001, 1, 0.000001, respectively. The second parameter refers to the
Rayleigh frequency resolution rather than the (default) Kallinger frequency
resolution. The screen output provides four entries:



1 freq 3.13205 sig 9.54539 rms 0.00449592 csig 9.54539
2 freq 3.98472 sig 7.43087 rms 0.00422861 csig 7.42755
3 freq 5.40686 sig 5.29838 rms 0.0040257 csig 5.29516
4 freq 17.3677 sig 4.13727 rms 0.00388775 csig 4.10809



For comparison, the project SigSpecNative, p. 8, employs the default 
settings. 

For the first entry, there is no difference between the two results, but 
due to propagation of uncertainties, the following entries show slight and 
increasing deviations. As expected, the rms errors of residuals are higher if 
the accuracy is reduced. 



3.10. Program termination 

There are three possibilities to specify a criterion for program termination:

1. the number of iterations (keyword iterations),

2. a lower sig limit (keyword siglimit),

3. the reliability of the entire analysis, as determined by the cumulative sig. It
is based on the probability that at least one of the frequency components
detected so far is due to noise. A threshold in terms of cumulative sig
may be defined using the keyword csiglimit. For an introduction to the
cumulative sig, please refer to p. 5.

Multiple specifications in terms of these keywords cause the prewhitening
cascade to terminate if one of the limits is reached.

The default assignment for siglimit is 5. This pre-definition may be
deactivated by defining

siglimit 0

in the .ini file. The limits iterations and csiglimit are switched off by 
default. 
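How the three criteria combine can be sketched as a single predicate. This is a hedged illustration rather than SigSpec's implementation: the function name and arguments are hypothetical, a limit of `None` or 0 is taken to mean "switched off", and the exact point in the loop at which the iteration count is compared is an assumption.

```python
def should_stop(iteration, sig, csig,
                iterations=None, siglimit=5.0, csiglimit=None):
    """Return True as soon as one of the active limits is reached.

    iteration : index of the current prewhitening step (1-based)
    sig, csig : sig and cumulative sig of the current component
    Limits set to None or 0 are treated as deactivated (assumption).
    """
    if iterations and iteration >= iterations:
        return True
    if siglimit and sig < siglimit:
        return True
    if csiglimit and csig < csiglimit:
        return True
    return False


# Default settings: only the sig limit of 5 is active.
print(should_stop(3, 4.60182, 4.60117))
```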



Example. The sample project terminate contains a combination of the
keywords siglimit, csiglimit and iterations, applied to the V photometry
of IC 4996 #89 as input file. For a comparison to the standard output,
please refer to Example SigSpecNative, p. 8. The file terminate.ini
contains a combination of three keywords:

siglimit 0
csiglimit 3
iterations 10

The first line deactivates the default setting of 5 for the sig limit. The
combination of the second and third line forces SigSpec to terminate after
10 iterations, or earlier, if the cumulative sig drops below 3. The screen
output provides seven lines, corresponding to six significant signal
components:



1 freq 3.13205 sig 9.54539 rms 0.00449592 csig 9.54539
2 freq 3.98471 sig 7.43085 rms 0.00422861 csig 7.42753
3 freq 5.40684 sig 5.30164 rms 0.0040257 csig 5.2984
4 freq 17.3677 sig 4.13698 rms 0.00388775 csig 4.10802
5 freq 3.67101 sig 3.73187 rms 0.00378701 csig 3.57943
6 freq 52.5182 sig 3.41319 rms 0.00369756 csig 3.18744
7 freq 41.7372 sig 3.02872 rms 0.00361981 csig 2.80001



The cumulative sig of 2.8 for component 7 is responsible for program 
termination before the limit of 10 iterations is reached. 
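The csig column of this example can be reproduced with a short calculation. The sketch assumes that sig is the negative decadic logarithm of a false-alarm probability and that the cumulative sig is the negative decadic logarithm of the probability that at least one component detected so far is due to noise, with the components treated as independent. The formal definition is given elsewhere in the manual (p. 5); the function name is hypothetical.

```python
import math


def cumulative_sig(sigs):
    """Cumulative sig of a list of individual sig values (all > 0).

    Assumes sig = -log10(p) with p a false-alarm probability, and
    csig = -log10(1 - prod(1 - p_i)), evaluated via log1p/expm1
    to avoid cancellation for very small probabilities.
    """
    log_none_false = sum(math.log1p(-10.0 ** (-s)) for s in sigs)
    return -math.log10(-math.expm1(log_none_false))


sigs = [9.54539, 7.43085, 5.30164, 4.13698, 3.73187, 3.41319, 3.02872]
for n in range(1, len(sigs) + 1):
    print(n, round(cumulative_sig(sigs[:n]), 5))
```

Under these assumptions the printed values agree with the csig column listed in the example to within rounding.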



4. Default Output 

All output files are written into the project directory. A six-digit index denotes
the iteration in the prewhitening cascade. E.g., an index 000000 represents a
file obtained from the input data without any prewhitening, 000001 denotes a
file after prewhitening of the first sinusoidal component. The general annotation
#iteration# will be used for this six-digit identifier.



Example.* The sample project output illustrates how to adjust the output
of SigSpec. The input file output.dat represents 16 nights (992 data
points) of Strömgren y photometry (Vienna University APT, T6;
Strassmeier et al. 1997) of the Delta Scuti star EE Cam (Breger, Rucinski &
Reegen 2007). The light curve is displayed in Fig. 10.



*The sample project output is the most time-consuming sample of all. The
computation takes 90 minutes on an Intel Core2 CPU T5500 (1.66GHz) under Linux
2.6.18.8-0.9-default i686. This is mostly due to the calculations of the Sock and Phase Distribution
Diagrams. In order to speed up the program, the user may switch off these operations
by placing a # character at the beginning of all lines containing keywords sock:... and
phdist:... in the file output.ini.




Figure 10: Light curve for the sample project output. 



The vast amount of output provided by Sock Diagrams and Phase
Distribution Diagrams makes it necessary to restrict the frequency interval in
the file output.ini. Especially close to zero frequency, the output may be
tremendous. Thus the very low frequencies are avoided:

lfreq 1
ufreq 16

The frequency spacing is adjusted to speed up the computations of Sock and
Phase Distribution Diagrams.

freqspacing 0.005

All other entries in the file output.ini apply to output files and are
discussed in the subsequent sections.

4.1. Spectra

By default, two spectra (files s000000.dat and resspec.dat) are generated.
The file s000000.dat contains the spectrum of the original time series, and the
file resspec.dat represents the residual spectrum after finishing the
prewhitening sequence.

The columns are 

1. frequency [inverse time units], 

2. sig, 

3. DFT amplitude [units of observable], 

4. Fourier-space phase angle [rad], 

5. Fourier-space phase angle of maximum sig [rad]. 

To achieve consistency with the output for differential significance spectra
(see p. 60), two further columns are found containing values -1 and 0 only.
The phase angles $\theta$ are given according to a trigonometric fit,

$$F(t) := A \cos(2\pi f t - \theta), \tag{9}$$

with amplitude $A$ and frequency $f$ as given in the file. This convention is
compatible with the definition of phase in Fourier space. This definition of phase
is consistently used for all types of SigSpec output.
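The phase convention matters whenever output values are used to reconstruct or subtract a signal. The following sketch evaluates a sinusoid in the cosine convention described above, with the phase angle subtracted inside the cosine, and removes it from a time series. Function names are hypothetical and the numbers are illustrative only.

```python
import math


def sinusoid(t, amplitude, freq, theta):
    """Evaluate amplitude * cos(2*pi*freq*t - theta)."""
    return amplitude * math.cos(2.0 * math.pi * freq * t - theta)


def prewhiten(times, values, amplitude, freq, theta):
    """Subtract one sinusoidal component from a time series."""
    return [v - sinusoid(t, amplitude, freq, theta)
            for t, v in zip(times, values)]


# With this convention the maximum of the sinusoid occurs at
# t = theta / (2*pi*freq), not at t = -theta / (2*pi*freq).
A, f, theta = 0.01, 3.13205, 1.0
print(sinusoid(theta / (2.0 * math.pi * f), A, f, theta))
```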

If the keyword spectra is provided in the .ini file, additional output
files s#iteration#.dat are generated. The index #iteration# starts with
000001, denoting the residual spectrum after the first prewhitening step.

The keyword spectra expects two integer parameters. The first defines the
number of iterations for which these files shall be generated. A negative number
causes SigSpec to generate files for all iterations. The second parameter has to
be a positive number and defines a step width. If it is set to 1, a file is generated
after each iteration; if it is set to 2, after every second iteration (starting with
s000002.dat), and so on.
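The two-parameter convention (number of iterations, step width) recurs for several output keywords. A minimal sketch of which indexed files it selects is given below; the helper name and the argument `n_iterations` (the number of prewhitening steps actually performed) are assumptions for illustration, and the file written by default for index 000000 is not included.

```python
def iteration_files(prefix, count, step, n_iterations):
    """File names <prefix>#iteration#.dat selected by (count, step).

    count < 0 selects all iterations; otherwise only the first
    `count` iterations are considered. `step` is the step width.
    """
    if step < 1:
        raise ValueError("step width has to be a positive number")
    last = n_iterations if count < 0 else min(count, n_iterations)
    return ["%s%06d.dat" % (prefix, i) for i in range(step, last + 1, step)]


# First 10 iterations, every second file:
print(iteration_files("s", 10, 2, 33))
```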

Example. The sample project output uses the keyword spectra in the file
output.ini, namely

spectra 10 2

Spectra are written only during the first 10 iterations of the prewhitening
sequence. The second parameter causes only every second file to be
generated. In this example, the following files are produced:

output/s000000.dat
output/s000002.dat
output/s000004.dat
output/s000006.dat
output/s000008.dat
output/s000010.dat

In addition, the file resspec.dat contains the residual spectrum after all
iterations.

4.2. Residual time series 

By default, a file residuals.dat is generated. It represents the residuals after
prewhitening all signal components found significant. The column format is the
same as for the time series input file.

If the keyword residuals is provided in the .ini file, additional files
t#iteration#.dat are generated, where the index #iteration# starts with
000001, denoting the residuals after the first prewhitening step.

The keyword residuals expects two integer parameters. The first defines
the number of iterations for which these files shall be generated. A negative
number causes SigSpec to generate files for all iterations. The second
parameter has to be a positive number and defines a step width. If it is set to 1, a
file is generated after each iteration; if it is set to 2, after every second iteration
(starting with t000002.dat), and so on.

Example. The sample project output uses the keyword residuals in the
file output.ini, namely

residuals -1 5

Setting the first parameter to -1 provides residual time series during the
entire prewhitening sequence. The second parameter causes only every fifth
file to be generated. Since the number of iterations in this example is
40, the following files are produced:

output/t000005.dat
output/t000010.dat
output/t000015.dat
output/t000020.dat
output/t000025.dat
output/t000030.dat

In addition, the file residuals.dat contains the residual time series after
all iterations.

4.3. Result files

The file result.dat contains a list of all identified sig maxima. This file 
consists of seven columns providing 

1. frequency [inverse time units], 

2. sig, 

3. amplitude [units of observable], 

4. phase [rad], 

5. rms scatter of the time series before prewhitening, 

6. point-to-point scatter of the time series before prewhitening, 

7. the cumulative sig for all frequency components detected so far. 

Columns 3 and 4 represent amplitude and phase as the result of a least- 
squares fit to the time series at the present prewhitening stage (i.e. after sub- 
traction of all previously identified signal components) for the frequency of 
maximum significance. 

The last line in the file contains zeroes for frequency, amplitude, and phase. 
The non-zero values refer to the (cumulative) sig of the most significant com- 
ponent below the threshold, and to the rms and point-to-point scatter after the 
last prewhitening step, respectively. This final line is suppressed if the criterion 
iterations is responsible for program termination. 
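For post-processing, the seven-column layout can be read with a few lines of Python. This is a sketch under the column order listed above; the type and function names are hypothetical, and in the sample text the amplitude, phase and point-to-point scatter values are invented placeholders (only the frequency, sig, rms and csig values are borrowed from the screen output of the sample project limits).

```python
from collections import namedtuple

Component = namedtuple(
    "Component", "freq sig amplitude phase rms ppscatter csig")


def parse_result(text):
    """Split the contents of a result file into components and trailer.

    The trailer is the final line with zero frequency, amplitude and
    phase (absent if `iterations` caused the termination).
    """
    components, trailer = [], None
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7:
            continue  # skip blank or malformed lines
        row = Component(*map(float, fields))
        if row.freq == 0.0 and row.amplitude == 0.0 and row.phase == 0.0:
            trailer = row
        else:
            components.append(row)
    return components, trailer


sample = """\
3.13205 9.54539 0.0021 1.234 0.00449592 0.0031 9.54539
3.98471 7.43085 0.0017 4.321 0.00422861 0.0030 7.42753
0 4.60182 0 0 0.0040257 0.0029 4.60117
"""
components, trailer = parse_result(sample)
print(len(components), trailer.sig)
```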

If the keyword results is provided in the .ini file, additional result files
r#iteration#.dat are generated, where the index #iteration# starts with
000001, denoting the result of the first iteration. The files contain the
significant components within the prewhitening cascade as preliminary results. The
MultiSine least-squares fits which are performed at each step of the
prewhitening sequence modify frequencies, amplitudes and phases. Therefore it may be
useful to have additional results from earlier iterations at hand, if the user
decides not to use all components found by SigSpec, without re-running the
program.

The keyword results expects two integer parameters. The first defines the
number of iterations for which these files shall be generated. A negative number
causes SigSpec to generate files for all iterations. The second parameter has
to be a positive number and defines a step width. If it is set to 1, a result file is
generated after each iteration; if it is set to 2, after every second iteration (starting
with r000002.dat), and so on.

Example. The sample project output uses the keyword results in the file
output.ini, namely

results -1 1

providing result files r000001.dat, r000002.dat, ..., for all iterations of
the entire prewhitening sequence. In addition, the final result after all
prewhitening iterations is contained in the file result.dat.

5. Analysis of the Time-domain Sampling 

Example. The sample project output contains the output of a spectral 
window, a sampling profile, a sock diagram, a phase distribution diagram, 
a preview, and correlograms. 

5.1. Spectral window 

The spectral window is computed if the keyword win is given in the .ini file.
This keyword does not require any parameters. The output is provided in the
file win.dat. It consists of three columns referring to

1. frequency [inverse time units], 

2. amplitude [units of observable], 

3. Fourier-space phase angle [rad]. 
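For orientation, a spectral window in the conventional sense (the Fourier transform of the sampling, i.e. of a series of unit values at the observed times) can be sketched as follows. The normalisation by the number of data points and the sign convention of the phase are assumptions of this sketch and need not match those used by SigSpec.

```python
import cmath
import math


def spectral_window(times, freqs):
    """Rows (frequency, amplitude, phase) of a conventional spectral window.

    The amplitude is normalised to 1 at zero frequency (assumption).
    """
    n = len(times)
    rows = []
    for f in freqs:
        z = sum(cmath.exp(2j * math.pi * f * t) for t in times) / n
        rows.append((f, abs(z), cmath.phase(z)))
    return rows


# Equidistant sampling with step 0.5: the window repeats at f = 2 (alias).
for row in spectral_window([0.0, 0.5, 1.0, 1.5], [0.0, 1.0, 2.0]):
    print(row)
```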

Example. The sample project output contains the output of a spectral
window. The file output.ini contains the keyword win, and the corresponding
output is found in the file output/win.dat and displayed in Fig. 11. The
frequency limits determined by the lines

lfreq 1
ufreq 16

also apply to the spectral window.

5.2. Sampling profile 

Figure 11: Spectral window for the sample project output.

The sampling profile is an essential part of the sig computation. The influence
of the time series sampling in Fourier space is entirely
contained in the three parameters $\alpha_0$, $\beta_0$, and $\theta_0$. The values of $\alpha_0$ and $\beta_0$
are measures for the maximum and minimum sig for all phase angles at a given
frequency, and the angle $\theta_0$ determines the phase angle where maximum sig is
obtained at the frequency under consideration. A detailed description is given
by Reegen (2007). If the keyword profile is provided in the .ini file, the
sampling profile for the given time series is written to the file profile.dat.
The four columns refer to

1. frequency [inverse time units],

2. $\alpha_0$,

3. $\beta_0$,

4. $\theta_0$ [rad].



Example. In the file output.ini, the keyword profile is given and forces
SigSpec to generate an output file output/profile.dat representing the
sampling profile displayed in Fig. 12.

5.3. Sock Diagram 

The computation of a Sock Diagram is an optional add-on of SigSpec. If
the keyword sock:phases is given in the .ini file, SigSpec computes sock
significances, i.e. sig levels for a constant signal-to-noise ratio at a set of
different phase angles, and for all frequencies for which spectra are calculated.
As described by Reegen (2007), the expected sig level for a given amplitude
signal-to-noise ratio at constant frequency and phase angle is proportional to
the squared amplitude signal-to-noise ratio. Sig levels in the Sock Diagram are
normalised to an expected value of 1, corresponding to an amplitude
signal-to-noise ratio

$$\frac{A}{\langle A \rangle} = \frac{2}{\sqrt{\pi \log e}}. \tag{10}$$

Figure 12: Sampling profile for the sample project output. The lower curve refers to
$\alpha_0$, the upper curve to $\beta_0$. The orientation angle of the rms error ellipse, $\theta_0$, is not
plotted.

The sig level for an arbitrary signal-to-noise ratio may be deduced by multiplying
the sig displayed in the Sock Diagram by $\frac{\pi \log e}{4} \left(\frac{A}{\langle A \rangle}\right)^2 \approx 0.341 \left(\frac{A}{\langle A \rangle}\right)^2$.
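The scaling between sock significance and the sig expected for a given amplitude signal-to-noise ratio can be written as a one-line conversion. The sketch assumes that the numerical factor of about 0.341 quoted above equals $\pi \log_{10} e / 4$; the names are hypothetical.

```python
import math

# Assumed closed form of the factor quoted as roughly 0.341.
SIG_PER_SNR2 = math.pi * math.log10(math.e) / 4.0


def sig_from_snr(snr, sock_sig=1.0):
    """Sig expected for an amplitude signal-to-noise ratio `snr`,
    given the (normalised) sock significance at that frequency/phase."""
    return sock_sig * SIG_PER_SNR2 * snr ** 2


def snr_for_sig(sig):
    """Amplitude signal-to-noise ratio corresponding to an expected sig
    at unit sock significance."""
    return math.sqrt(sig / SIG_PER_SNR2)


print(round(SIG_PER_SNR2, 3))      # factor of about 0.341
print(round(snr_for_sig(1.0), 3))  # signal-to-noise ratio at which the expected sig is 1
```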

Furthermore, the phase angle in the Sock Diagram is given with respect to
$\theta_0$, i.e. the phases with maximum sock significance are consistently aligned to
zero phase for all frequencies.

The number of phase angles in the interval $[0, \pi[$ to be taken into account
for each frequency of the spectrum has to be given as an argument to the
keyword sock:phases in the .ini file. The sig levels in the phase intervals
$[0, \pi[$ and $[\pi, 2\pi[$ are symmetric according to

$$\mathrm{sig}(A, \omega, \theta) = \mathrm{sig}(A, \omega, \theta + \pi) \quad \forall\, \theta, \tag{11}$$

but both given in the output file sock.dat for completeness. The result
represents a three-dimensional polar diagram of the sampling properties of the time
series input file. To enhance the corresponding plot resolution, the number of
phases specified with the keyword sock:phases in the .ini file is scaled by
the maximum sig for each frequency. For sig maxima < 1, the specified number
is used, for sig maxima between 1 and 2, the number is doubled, and so on.

To enhance the quality of Sock Diagrams produced by SigSpec, the keyword
sock:fill can be provided to specify a fill factor (as a floating-point
number following the keyword). It is used for adaptive oversampling of
frequencies according to the differences of maximum sigs for consecutive frequencies.
The fill factor is the (rounded) number of additional frequencies per unit of
sig difference. In other words, providing sock:fill 10 guarantees that the
resolution of the resulting Sock Diagram along the sig axis does not exceed 0.1,
and an appropriate combination of the keywords sock:phases and sock:fill
produces a Sock Diagram that mimics a continuous surface when plotted in
3D style. The default argument of sock:fill is 0, which means that adaptive
oversampling is switched off.
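The two adaptive rules (phase count scaled by the maximum sig, extra frequencies from the fill factor) can be sketched as follows. The helper names are hypothetical, and the treatment of values lying exactly on an integer boundary as well as the rounding mode are assumptions.

```python
import math


def scaled_phase_count(n_phases, max_sig):
    """Number of phases used at one frequency: the specified number for
    max_sig < 1, doubled for 1 <= max_sig < 2, and so on."""
    return n_phases * (math.floor(max_sig) + 1)


def extra_frequencies(fill, max_sig_a, max_sig_b):
    """Rounded number of additional frequencies inserted between two
    consecutive frequencies with maximum sigs max_sig_a and max_sig_b."""
    return round(fill * abs(max_sig_b - max_sig_a))


print(scaled_phase_count(45, 1.3))       # 90 phases
print(extra_frequencies(10, 1.00, 1.26)) # 3 additional frequencies
```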

Caution: the Sock Diagram may become a huge file! Especially
for very low frequencies, a tremendous amount of data may be
expected. Thus it is advisable either to exclude this frequency
region (keyword lfreq) or to assign very low values to sock:phases
and sock:fill.

The user may choose to obtain the Sock Diagram in three-dimensional
cylindrical (default, or keyword sock:cyl) or cartesian coordinates (keyword
sock:cart).

In any case, the output file sock.dat consists of three columns. In cylin- 
drical coordinates, the columns refer to 

1. height coordinate: frequency [inverse time units], 

2. azimuthal coordinate: phase with respect to the sig maximum [rad], 

3. radial coordinate: sock significance. 

In cartesian coordinates, the columns refer to 

1. frequency [inverse time units], 

2. sock significance component in the direction of the sig maximum, 

3. sock significance component in the direction of the sig minimum. 
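The relation between the two coordinate systems can be sketched as an ordinary polar-to-cartesian conversion. This assumes that the direction of the sig minimum is perpendicular to that of the sig maximum (phase offset of $\pi/2$), which the text does not state explicitly; the function name is hypothetical.

```python
import math


def cyl_to_cart(freq, phase, sock_sig):
    """Convert one Sock Diagram point from cylindrical columns
    (frequency, phase w.r.t. sig maximum, sock significance) to
    cartesian columns (frequency, component along the sig maximum,
    component along the sig minimum)."""
    return (freq,
            sock_sig * math.cos(phase),
            sock_sig * math.sin(phase))


print(cyl_to_cart(2.0, 0.0, 1.5))  # (2.0, 1.5, 0.0)
```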

The keywords sock:colmodel:lin and sock:colmodel:rank allow the user to
choose between two different colour models assigning RGB colours to the
data points of the Sock Diagram. The linear model (sock:colmodel:lin)
uses the sock significance as it is for colour scaling, whereas the rank model
(sock:colmodel:rank) relies on a rank statistics of sock significances.

Caution: the computation of ranks may be very time-consuming!
The progress control displayed during the calculation of the
rank statistics does not provide linear percentages in time. The
percentage values refer to the portion of ranks among the number
of data points that are finished.

A sequence of keywords sock:colour determines a colour path that is
assigned to the data points in the Sock Diagram. The keyword is followed by
four floating-point arguments. The first three arguments refer to the three RGB
channels. Using the linear model (sock:colmodel:lin), the fourth argument
is the sock significance to which the given colour has to be assigned. For the
rank model (sock:colmodel:rank), the fourth argument is a floating-point
value in the interval [0, 1] and determines the fractile of data points to which the
given colour has to be assigned. A value of, e.g., 0.5 assigns the specified colour
to the median of sock significances. SigSpec performs a linear interpolation
along this colour path and assigns a fourth column to the output file sock.dat
containing RGB values. For entries beyond the start or end of the colour path,
the start or end colour is used, correspondingly.
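The interpolation along a colour path, including the clamping beyond its ends, can be sketched as below. The representation of the path as a list of (R, G, B, position) tuples and the function name are assumptions; how SigSpec encodes the resulting RGB value in the fourth output column is not covered here.

```python
def colour_at(path, value):
    """Linearly interpolate an RGB colour along a colour path.

    path  : list of (r, g, b, position) tuples, as given by a sequence
            of colour keywords (position = sock significance for the
            linear model, fractile in [0, 1] for the rank model)
    value : position at which the colour is requested
    """
    stops = sorted(path, key=lambda s: s[3])
    if value <= stops[0][3]:
        return tuple(stops[0][:3])      # before the start: start colour
    if value >= stops[-1][3]:
        return tuple(stops[-1][:3])     # beyond the end: end colour
    for lo, hi in zip(stops, stops[1:]):
        if lo[3] <= value <= hi[3]:
            w = (value - lo[3]) / (hi[3] - lo[3])
            return tuple(a + w * (b - a) for a, b in zip(lo[:3], hi[:3]))


# Greyscale path from black (position 0) to white (position 1).
grey = [(0, 0, 0, 0.0), (255, 255, 255, 1.0)]
print(colour_at(grey, 0.5))  # (127.5, 127.5, 127.5)
```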

Figure 13: Sock Diagram for the sample project output.

Example. A linear colour model that produces colours from white via red,
yellow, green, cyan, blue, and magenta to black is produced by the following
specifications:

sock:colmodel:lin
sock:colour 255 255 255 .5
sock:colour 255 0 0 .9
sock:colour 255 255 0 .95
sock:colour 0 255 0 1
sock:colour 0 255 255 1.05
sock:colour 0 0 255 1.1
sock:colour 255 0 255 1.2
sock:colour 0 0 0 2

Example. A rank colour model producing greyscale coding is obtained by:

sock:colmodel:rank
sock:colour 0 0 0 0
sock:colour 255 255 255 1

Example. The Sock Diagram in the sample project output is generated
according to the following entries in the file output.ini:

sock:cyl
sock:phases 45
sock:fill 10
sock:colmodel:lin
sock:colour 255 255 255 0.98
sock:colour 0 0 0 1.02

The resulting file output/sock.dat is displayed in Fig. 13.

5.4. Phase Distribution Diagram

In addition to the spectral window and Sock Diagram, SigSpec can compute
the probability density of phase angles at given frequency as a function of
frequency. This is an alternative way to examine the properties of the sampling
in the time domain and is activated by the keyword phdist:phases in the .ini
file. The resulting probability densities are normalised in a way that their mean
over all phase angles is 

The number of phases to be computed is increased according to the eccen- 
tricity of the phase distribution at a given frequency. 

In perfect analogy to the Sock Diagram (p. 32), there are further keywords 
available to adjust the contents of the output file phdist.dat. 

• phdist:fill determines a filling factor for additional frequencies if the
changes between the phase distributions for two adjacent frequencies are
too high.

• phdist:cyl specifies cylindrical coordinates (height: frequency, azimuth:
phase, radial: probability density of phase)

• phdist:cart specifies cartesian coordinates

• phdist:colmodel:lin

• phdist:colmodel:rank

• phdist:colour

Please refer to "Sock Diagram" (p. 32) for a detailed description.

Caution: For frequencies close to zero, tremendous output may 
be expected! Try to avoid the very low frequency region, if pos- 
sible. 




Figure 14: Phase Distribution Diagram for the sample project output.

Example. The Phase Distribution Diagram in the sample project output
is generated according to the following entries in the file output.ini:

phdist:cart
phdist:phases 30
phdist:fill 50
phdist:colmodel:rank
phdist:colour 223 223 223 0
phdist:colour 31 31 31 1

The resulting file output/phdist.dat is displayed in Fig. 14.

6. MultiSine Output

After each step of prewhitening, SigSpec performs a MultiSine least-squares 
fit over all significant signal components detected so far. Two optional types of 
output may help the user comprehend how this procedure performs at runtime. 

6.1. MultiSine tracks 

The MultiSine tracks allow the user to examine the changes in frequency, amplitude
and phase of each signal component in the prewhitening cascade and are an
alternative representation of the result files. Instead of a file index that refers
to the iteration, the file index of the MultiSine track files m#index#.dat refers
to the index of the component in the result files and lists its

1. frequency [inverse time units], 

2. amplitude [units of observable], 

3. phase [rad] 

for each prewhitening step. In other words, a result file displays all the compo- 
nents for an iteration, whereas the MultiSine track file displays all the iterations 
for a component. Thus the MultiSine track provides a good estimate for the 
reliability and accuracy of the components found significant. 

If the keyword mstracks is provided in the .ini file, MultiSine track files
m#index#.dat are generated, where the index #index# starts with 000001,
denoting the first significant signal component.

The keyword mstracks expects two integer parameters. The first defines
the number of iterations for which entries in the MultiSine track files shall
be generated. A negative number causes SigSpec to generate entries for all
iterations. The second parameter has to be a positive number and defines a
step width. If it is set to 1, a line in the MultiSine track files is generated for each
iteration; if it is set to 2, for every second iteration (starting with the iteration
corresponding to r000002.dat), and so on.

Example. The sample project output uses the keyword mstracks in the
file output.ini, namely

mstracks -1 1

providing MultiSine tracks m000001.dat, m000002.dat, ..., for all iterations
of the entire prewhitening sequence. The MultiSine track for the primary
signal component (file m000001.dat) is displayed in Fig. 15.




Figure 15: MultiSine track of the most dominant signal component in the light curve
of the sample project output (8.59 cycles per day), according to the output file
output/m000001.dat. Left: amplitude vs. frequency. Mid: phase vs. frequency.
Right: amplitude vs. phase.



6.2. MultiSine profiles 

A closer examination of the accuracy of the MultiSine fitting procedure is
provided by the MultiSine profiles. If the user specifies the keyword msprofs in
the .ini file, SigSpec produces additional output files f#iteration#.dat,
a#iteration#.dat, and p#iteration#.dat. The idea is to evaluate the
rms residual through modifying a single parameter of a single signal
component, keeping all other parameters constant. Performing this operation for
the frequency of each component produces a set of rms-residual-vs.-frequency
plots, all written to the file f#iteration#.dat. Correspondingly, the files
a#iteration#.dat and p#iteration#.dat contain rms-residual-vs.-amplitude
and rms-residual-vs.-phase plots. The frequencies are scanned around the
best fit by $\pm \frac{1}{T \sqrt{\mathrm{sig}}}$, $T$ denoting the time interval width of the input time series,
and sig referring to the signal component under consideration. The amplitudes
are calculated from zero to twice the amplitude of best fit, and the phases in a
range of $\pm\pi$ around the phase of best fit.

The keyword msprofs is followed by three integer values, the first denoting
the number of data points an individual MultiSine profile shall consist of.^
The second parameter defines the number of iterations for which MultiSine
profiles shall be generated. A negative number causes SigSpec to generate
profiles for all iterations. The third parameter has to be a positive number and
defines a step width. If it is set to 1, a MultiSine profile is generated after each
iteration; if it is set to 2, after every second iteration (starting with f000002.dat,
a000002.dat, p000002.dat), and so on.

The output files consist of seven columns: 

1. frequency, amplitude, or phase, respectively, 

2. rms residual, 

3. first-order approximation, based on the tangential gradient (which should 
be zero, so that the deviation from zero is a measure of the accuracy of 
the MultiSine fitting procedure), 

4. second-order approximation, based on the first and second derivatives at 
the parameter value of best fit, 

5. point-to-point scatter, 

6. index of the signal component, 

7. index of the harmonic (0 for fundamental), see "Analysis of Harmonics", 
p. 53. 

For each signal component, the first row refers to the parameter value of 
best fit, as used in the result file. 
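The idea behind an amplitude profile (vary one parameter of one component, keep everything else fixed, record the rms residual) can be sketched as follows. This is an illustration only: the names are hypothetical, the rms is taken as the root of the mean squared residual, components are stored as (frequency, amplitude, phase) in the cosine convention with the phase subtracted, and the sketch neither orders the rows as SigSpec does nor computes the derivative-based columns.

```python
import math


def rms_residual(times, values, components):
    """rms of the residuals after subtracting all components, each given
    as (freq, amplitude, phase) with model A * cos(2*pi*f*t - phase)."""
    res = [v - sum(a * math.cos(2.0 * math.pi * f * t - th)
                   for f, a, th in components)
           for t, v in zip(times, values)]
    return math.sqrt(sum(r * r for r in res) / len(res))


def amplitude_profile(times, values, components, index, n_points):
    """Scan the amplitude of component `index` from zero to twice its
    best-fit value; return rows (amplitude, rms residual)."""
    f, a_best, th = components[index]
    profile = []
    for k in range(n_points):
        a = 2.0 * a_best * k / (n_points - 1)
        trial = list(components)
        trial[index] = (f, a, th)
        profile.append((a, rms_residual(times, values, trial)))
    return profile


# Synthetic, noise-free example: the minimum lies at the true amplitude.
times = [0.05 * i for i in range(60)]
values = [1.5 * math.cos(2.0 * math.pi * 0.8 * t - 0.3) for t in times]
for a, rms in amplitude_profile(times, values, [(0.8, 1.5, 0.3)], 0, 5):
    print(a, rms)
```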

Example. The sample project output uses the keyword msprofs in the file
output.ini, namely

msprofs 10000 50 3

providing MultiSine profiles f000003.dat, a000003.dat, p000003.dat
(f000006.dat, a000006.dat, p000006.dat), ..., for a maximum of 50
iterations of the prewhitening sequence. Each MultiSine profile is specified to
contain 10 000 data points. The number of significant components found in
the time series output.dat is 33, so that the last set of MultiSine profiles
(f000033.dat, a000033.dat, p000033.dat) refers to the final solution
contained in the file result.dat. The MultiSine profiles for the primary signal
component at 8.59 cycles per day are displayed in Fig. 16.

^Due to the internal accuracy of the index computation, the actual number of points
may differ from this value by ±1.

Figure 16: MultiSine profiles of the most dominant signal component in the light
curve of the sample project output (8.59 cycles per day), according to the output
files output/f000033.dat, output/a000033.dat, output/p000033.dat. Left: rms
residual vs. frequency. Mid: rms residual vs. amplitude. Right: rms residual vs. phase.
The solid black line refers to the rms residual, the dashed black line to the tangential
gradient at the value of best fit (which should be zero), and the dashed-dotted black
line to a second-order approximation based on the first two derivatives of rms residual.
The solid grey line represents the point-to-point scatter.



7. Preview 

Since the prewhitening cascade performed by SigSpec may be extremely time 
consuming, the program can compute a preview. This add-on is activated by 
the keyword preview in the .ini file. 

Whereas the significance spectra rely on the False-Alarm Probability com- 
pared to a noise dataset with the same rms error as the given time series (or 
series of residuals, respectively), the significance spectrum provided in the file 
preview.dat represents a set of identified maxima in the significance spectrum 
of the original time series, but based on the point-to-point scatter in the time 
domain rather than on the standard deviation of observables. The lower sig 
limit for writing a local maximum to the file preview.dat is specified as the 
argument to the keyword preview in the .ini file. 

The calculation of the sig is based on the assumption that only the point- 
to-point scatter is random, and everything else contributing to the rms error 
represents signal that will be prewhitened in the course of the subsequent loop. 
The preview output is to be considered as a rough estimate for the final result 
obtained by step-by-step prewhitening and contains not only the intrinsic vari- 
ations but also all aliases, which will not occur in the following analysis. The 
file preview.dat consists of four columns referring to 



1. frequency [inverse time units],

2. sig,

3. DFT amplitude [units of observable], 

4. phase [rad]. 

Example. The sample project preview contains a preview file for the V
photometry of IC 4996 # 89. In the file preview.ini, the line

preview:siglimit 5

sets the sig threshold to 5. The output file preview/preview.dat contains
11 components, sorted by frequency. The frequencies and corresponding sigs
in the first two columns are

 0.9945158494303480   6.1674140356166323
 1.9917563998155081   6.9302735632389876
 2.1388902515119841   6.7175642729893710
 2.9835475482878717   8.6773027802854656
 3.1361308018982843   9.4899187898938777
 3.9862375005883859   8.9589776551282210
 4.1333713522847031   8.7492615592402885
 4.9780286490607102   5.0523760377159039
 5.1360613045861099   5.3438911274207790
11.0268647743572874   5.5214237212500406
12.0241053247411784   5.6674302270769710

Fig. 17 displays the significance and amplitude spectrum of the original time
series. Since the preview does not employ any prewhitenings, aliases are
present in the file.

• The signal at 3.132 cycles per day corresponds to components # 3, 5,
7, and 9.

• The signal at 3.986 cycles per day corresponds to components # 1, 2,
4, 6, and 8.

• The signal at 5.409 cycles per day is not found in the preview. In
the result of the prewhitening sequence, its sig is 5.02. Since the sig
in the preview relies on the rms deviation of the original time series,
whereas the final sig is based on the rms deviation after the previous
prewhitening step, the sig associated to this frequency falls below the
pre-selected threshold of 5 in the preview. The significance spectrum
(grey line in the left panel of Fig. 17) shows a peak at the frequency
under consideration, the sig of which is ≈ 4.8.

• Components # 10 and 11 are 1-cycle-per-day aliases of each other,
but do not show up in the final result, preview/result.dat.

Figure 17: Grey: Fourier spectra for the sample project preview. Left: significance
spectrum. Right: DFT amplitudes. The significant components in the preview are
indicated by dots with dashed drop lines (file preview/preview.dat). The default
sig threshold of 5 is represented by a horizontal dashed line in the left panel.
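
The assignment of preview components to alias families can be reproduced with
a few lines of code. The sketch below is not part of SigSpec; the function
name, the tolerance of 0.02 cycles per day and the seed-based grouping are
assumptions made for illustration. It groups the eleven preview frequencies by
their position within a 1-cycle-per-day alias pattern.

```python
def alias_families(frequencies, spacing=1.0, tol=0.02):
    """Group frequencies that differ by nearly integer multiples of `spacing`.

    Returns lists of 1-based component numbers. A frequency joins the first
    family whose seed (its first member) it matches within `tol`.
    """
    families = []  # each entry: (seed position within the pattern, members)
    for number, f in enumerate(frequencies, start=1):
        position = (f / spacing) % 1.0
        for seed, members in families:
            d = abs(position - seed)
            if min(d, 1.0 - d) <= tol:  # wrap-around distance
                members.append(number)
                break
        else:
            families.append((position, [number]))
    return [members for _, members in families]


# First column of preview/preview.dat as listed in the example
# (leading zero of the first value assumed).
preview_frequencies = [
    0.9945158494303480, 1.9917563998155081, 2.1388902515119841,
    2.9835475482878717, 3.1361308018982843, 3.9862375005883859,
    4.1333713522847031, 4.9780286490607102, 5.1360613045861099,
    11.0268647743572874, 12.0241053247411784,
]
print(alias_families(preview_frequencies))
```

With these settings the components fall into three families: # 1, 2, 4, 6, 8;
# 3, 5, 7, 9; and # 10, 11.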

8. Correlograms 

SigSpec is able to compute correlograms of the time series for each stage
of prewhitening. The correlogram files are named c#iteration#.dat. The
calculation of correlograms is activated by the keyword correlograms, which
requires three integer parameters. The first parameter represents the maximum
order to which to compute serial correlations, i.e. the limit of index lag for
each correlogram. Setting it to zero forces SigSpec to adjust it to half the
number of data points in the time series. The second parameter is the maximum
number of iterations for which to compute correlograms. If the number of
prewhitening iterations exceeds this value, no correlogram is generated for
the iterations after this limit. If a number < 0 is given, a correlogram is
computed for each prewhitening stage. The third parameter has to be a positive
number and defines a step width. If it is set to 1, a file is generated after
each iteration; if it is set to 2, after every second iteration (starting with
c000002.dat); and so on. The correlogram computation is switched off by
default.

A file rescorr.dat is generated, if the keyword correlograms is specified, 
no matter which parameter constellation is chosen. 

A correlogram file consists of two columns referring to 

1. index lag, 

2. serial correlation coefficient. 
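
As an illustration of the two-column content, a serial correlation coefficient
per index lag may be computed as sketched below. The estimator (products of
deviations from the mean, normalised by the total sum of squares) is a common
textbook choice and is assumed here; the text does not state which estimator
SigSpec uses, nor whether lag 0 is written to the file. The function name is
hypothetical.

```python
def correlogram(x, max_lag=0):
    """Serial correlation coefficients for index lags 0..max_lag.

    A max_lag of 0 is replaced by half the number of data points, mirroring
    the behaviour described for the first parameter of `correlograms`.
    """
    n = len(x)
    if max_lag == 0:
        max_lag = n // 2
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    norm = sum(d * d for d in dev)
    return [
        (lag, sum(dev[i] * dev[i + lag] for i in range(n - lag)) / norm)
        for lag in range(max_lag + 1)
    ]
```

Each returned pair corresponds to one row (index lag, serial correlation
coefficient) of a correlogram file.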

Example. The sample project correlograms illustrates how correlograms
are generated with SigSpec using the V photometry of IC 4996 # 89 as time
series input file correlograms.dat. The file correlograms.ini contains
the line

correlograms 100 -1 1

which forces SigSpec to evaluate correlograms with a maximum index lag of
100 (first parameter) for all iterations (negative value of second parameter).
After each iteration, a correlogram is generated (third parameter). The
output files

correlograms/c000000.dat
correlograms/c000001.dat
correlograms/c000002.dat
correlograms/rescorr.dat

are generated as displayed in Fig. 18.



Figure 18: Correlograms for the sample project correlograms. Solid: correlogram
of the initial time series (file correlograms/c000000.dat). Dashed: correlogram
after one prewhitening (file correlograms/c000001.dat). Dashed-dotted:
correlogram after two prewhitenings (file correlograms/c000002.dat). Dotted:
residual correlogram after three prewhitenings (file correlograms/rescorr.dat).



9. Time-resolved Analysis 

In time-resolved mode, SigSpec performs an analysis for a set of time intervals
rather than for the entire time series. An interval of width given by the keyword
timeres:range is moved in steps, the width of which is given by the keyword
timeres:step, from the beginning of the time series to the end.^ Consecutive
time intervals are free to overlap. Time series data within such an interval are
used to form a subset for which the analysis is performed. In addition, statistical
weights may be applied to the subset data, all with respect to the centre of the
interval, which shall be denoted $t_c$.

The only exception is the keyword timeres:w:damp. In this case, the
analysis is optimised for a signal excited at the beginning of the time interval
corresponding to the subset under consideration, $t_s$, and exponentially damped
towards the end of the subset.

The weight functions of time are given in Table 1. The normalisation
of weights is automatically performed by SigSpec. The combination of a weight
function for time-resolved mode with a weights column (keyword col:weights)
is also supported.

keyword            arguments                     weight function

timeres:w:none                                   1
timeres:w:ipow     …                             …
timeres:w:gauss    …                             …
timeres:w:exp      …                             …
timeres:w:damp     …                             …
timeres:w:cos      $\nu, \varphi$                $\cos\left(2\pi\nu\,|t - t_c| - \varphi\right)$
timeres:w:cosp     $\nu, \varphi, \varepsilon$   $\cos^{\varepsilon}\left(2\pi\nu\,|t - t_c| - \varphi\right)$

Table 1: Weight functions for time-resolved SigSpec analysis. The beginning of the
time interval associated with the respective subset is denoted $t_s$, whereas $t_c$
symbolises the centre of the time interval.

^In general, the step width is slightly modified by the software to achieve
time-resolved analysis over the entire time series.
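
The sliding-interval scheme can be sketched as follows. Everything in this
block is an assumption for illustration: the function names are hypothetical,
the Gaussian is taken in its usual form exp(-(t - t_c)^2 / (2 sigma^2)) with
the keyword argument read as the standard deviation, and the slight step-width
adjustment that SigSpec applies is not reproduced, so the number of intervals
may differ from what the program actually produces.

```python
import math


def interval_starts(t_first, t_last, width, step):
    """Start times of intervals of length `width` moved in steps of `step`."""
    n = int(math.floor((t_last - t_first - width) / step + 1e-9)) + 1
    return [t_first + k * step for k in range(max(n, 1))]


def gaussian_weights(times, t_start, width, sigma):
    """(time, unnormalised weight) pairs for the points inside one interval.

    The weight is centred on t_c, the middle of the interval.
    """
    t_c = t_start + 0.5 * width
    return [
        (t, math.exp(-0.5 * ((t - t_c) / sigma) ** 2))
        for t in times
        if t_start <= t <= t_start + width
    ]
```

The pairs returned by `gaussian_weights` have the same layout as the
two-column weight files: time value and weight without normalisation.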

In time-resolved mode, the set of output files as given in "Default Output",
p. 25, is generated for each subset of the time series. This requires the
introduction of an additional six-digit index, #interval#, in addition to
#iteration#, and the annotation for the output files is

1. wts.#interval#.dat for the weight function vs. time in each subset,

2. s#iteration#.#interval#.dat for the spectra,

3. t#iteration#.#interval#.dat for the residuals after each step of
prewhitening,

4. r#iteration#.#interval#.dat for the results after each step of
prewhitening,

5. m#index#.#interval#.dat for the results after each step of prewhitening,

6. result.#interval#.dat for the result files, each with a list of
significant signal components,

7. residuals.#interval#.dat for the final residuals after the prewhitening
of all significant signal components,

8. resspec.#interval#.dat for the residual spectrum after the prewhitening
of all significant signal components.

The column syntax is strictly consistent with the time-unresolved versions (see
"Default Output", p. 25). The additional files, wts.#interval#.dat, are in
two-column format. The first column represents the time values in the
corresponding subset, the second column contains the weight function without
normalisation.

Furthermore, SigSpec generates a file t000000.#interval#.dat, which
contains the part of the original time series which is actually used as input.

Special functions - as introduced in "Analysis of the Time-domain Sampling"
(p. 31), "Preview" (p. 41), and "Correlograms" (p. 43) - are also supplied
with the #interval# index, i.e.

1. win.#interval#.dat for the amplitude windows,

2. profile.#interval#.dat for the sampling profiles,

3. sock.#interval#.dat for the Sock Diagrams,

4. phdist.#interval#.dat for the phase distribution diagrams,

5. preview.#interval#.dat for the previews,

6. c#iteration#.#interval#.dat for the correlograms after each step of
prewhitening,

7. rescorr.#interval#.dat for the final correlograms after the prewhitening
of all significant signal components.
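
The naming scheme lends itself to a small helper, e.g. for collecting output
files in a post-processing script. The function below is a hypothetical
convenience, not a SigSpec interface; it only encodes the six-digit
#iteration# and #interval# pattern described above.

```python
def timeres_name(prefix, interval, iteration=None):
    """Build an output file name for time-resolved mode.

    prefix    -- "s", "t", "r", "c" for per-iteration files, or e.g. "wts",
                 "result", "residuals", "resspec", "preview" for files that
                 carry only the interval index
    interval  -- #interval# index
    iteration -- #iteration# index, or None for per-interval files
    """
    if iteration is None:
        return f"{prefix}.{interval:06d}.dat"
    return f"{prefix}{iteration:06d}.{interval:06d}.dat"
```

For instance, `timeres_name("t", 13, iteration=0)` gives t000000.000013.dat,
the input subset of the interval with index 13.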

Example. The sample project timeres illustrates the time-resolved analysis
using Strömgren y photometry of the Delta Scuti star 4 CVn acquired
with the Vienna University Automatic Photoelectric Telescope (Strassmeier
et al. 1997). The data represent 16 nights from February 21 to March 16,
2007, and are displayed in Fig. 19.

Figure 19: Time series used for the sample project timeres, representing 14 nights
of Strömgren y photometry of the Delta Scuti star 4 CVn, acquired in February and
March, 2007.



The file timeres.ini contains the specifications

timeres:range 10
timeres:step 1

which provide a 10-day interval moving over the time base of 24 days,
with one-day steps. The resulting 14 subsets are represented by the files
timeres/t000000.000000.dat to timeres/t000000.000013.dat. Gaussian
weight functions with a standard deviation of 5 days are applied:

timeres:w:gauss 5

The files timeres/wts.000000.dat to timeres/wts.000013.dat contain
the weights applied to each data point within each subset. Further output
files are

• timeres/s000000.######.dat for the significance spectra of the original
time series without prewhitening (Fig. 20),

• timeres/result.######.dat for the lists of significant signal
components,

• timeres/residuals.######.dat for the residual time series after all
prewhitening steps (divided into subsets according to the time intervals),
and

• timeres/resspec.######.dat for the significance spectra of residuals.

Here ###### denotes six-digit numbers ranging from 000000 to 000013.

Figure 20: Time-resolved significance spectra for 14 subsets (from top to bottom)
automatically generated in the sample project timeres. In each panel, the significance
spectrum of the full dataset is displayed in grey colour for comparison.

10. SigSpec AntiAlC: Anti-aliasing Correction Mode

In AntiAlC mode, SigSpec does not follow a strict step-by-step prewhitening
sequence. Instead, test runs are performed for a number of candidate peaks in
the significance spectrum in order to find the solution that produces a minimum
residual rms scatter after a user-given number of prewhitenings.

1. All peaks above a given sig limit are taken into consideration. The
keyword antialc:par in the .ini file is followed by a floating-point number.
This quantity is the AntiAlC parameter $p_{\mathrm{al}}$, which has to attain
a value in the interval ]0, 1]. If the highest sig in the considered frequency
range is $\max[\mathrm{sig}(A)]$, then the sig limit is
$p_{\mathrm{al}} \max[\mathrm{sig}(A)]$. I.e., the AntiAlC parameter
determines the sig limit for the candidate peak selection relative to the
highest peak in the spectrum under consideration. Alternatively or in
addition, a sig threshold for the AntiAlC candidate selection may be defined
using the keyword antialc:siglimit. If neither antialc:par nor
antialc:siglimit is present, the sig limit specified by siglimit in the
.ini file (p. 24) is also used for the AntiAlC candidate selection.

2. The candidate selection is performed for each step in the test prewhitening
sequence.

3. The resulting procedure is the computation of all combinations of
candidate peaks above a sig threshold determined by the AntiAlC parameter.
The number of iterations for these test prewhitenings is determined by
the keyword antialc:depth, followed by an integer value. It specifies
the depth of the AntiAlC computation.

4. The combination of peaks that yields the minimum residual rms deviation
among all examined combinations is selected.

5. SigSpec does not necessarily adopt all iterations performed in the test
run for the main prewhitening cascade. The integer value following the
keyword antialc:adopt determines how many prewhitening steps shall
be adopted. This quantity must not exceed the computation depth provided
by the keyword antialc:depth. If the limits specified by the
keywords iterations, siglimit, or csiglimit are reached, the output
may even terminate before the number specified by the keyword
antialc:adopt.
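
The five steps above can be sketched as a small exhaustive search. This is a
schematic under explicit assumptions, not SigSpec's implementation: all
function names are hypothetical, the spectrum evaluation, prewhitening and rms
computation are passed in as callbacks, and when both antialc:par and
antialc:siglimit are given, the stricter of the two limits is assumed to
apply (the text does not specify how they are combined).

```python
def select_candidates(sigs, par=None, siglimit=None, default_siglimit=5.0):
    """Indices of peaks that qualify as candidates (step 1)."""
    limits = []
    if par is not None:
        if not 0.0 < par <= 1.0:
            raise ValueError("the AntiAlC parameter must lie in ]0, 1]")
        limits.append(par * max(sigs))  # limit relative to the highest peak
    if siglimit is not None:
        limits.append(siglimit)
    limit = max(limits) if limits else default_siglimit
    return [i for i, s in enumerate(sigs) if s >= limit]


def best_combination(state, depth, peaks_of, prewhiten, rms_of, **limits):
    """Try all candidate combinations down to `depth` (steps 2 to 4).

    peaks_of(state)     -- sigs of the peaks in the current spectrum
    prewhiten(state, i) -- new state after removing peak number i
    rms_of(state)       -- residual rms deviation of a state
    Returns (minimum rms, list of chosen peak indices per iteration).
    """
    sigs = peaks_of(state)
    candidates = select_candidates(sigs, **limits) if depth > 0 and sigs else []
    if not candidates:
        return rms_of(state), []
    best_rms, best_path = float("inf"), []
    for i in candidates:
        rms, path = best_combination(prewhiten(state, i), depth - 1,
                                     peaks_of, prewhiten, rms_of, **limits)
        if rms < best_rms:
            best_rms, best_path = rms, [i] + path
    return best_rms, best_path
```

Only the first antialc:adopt entries of the returned index list would then be
taken over into the main cascade (step 5). The cost grows with the product of
the candidate counts per level, which is why depth and candidate limit should
be chosen with care.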

According to Reegen (2007), the expected sig is approximately proportional
to the squared amplitude, if all influences by the time-domain sampling are
neglected. The combination of $n$ sinusoidal signal components interacting via
aliasing is expected to produce a maximum amplitude that does not exceed the
sum of amplitudes of the sinusoidal components. Consequently, the square root
of the sig of such a combination, $\mathrm{sig}_{\mathrm{al}}$, is very likely
below the sum of square roots of individual sigs $\mathrm{sig}_n$,

$$\sqrt{\mathrm{sig}_{\mathrm{al}}} < \sum_{n} \sqrt{\mathrm{sig}_n}.$$

If these all are assumed equal and denoted $\mathrm{sig}_{\mathrm{ind}}$, then
the upper sig limit for the alias is $n^2\,\mathrm{sig}_{\mathrm{ind}}$. In
other words, if a given peak with a sig $\mathrm{sig}_{\mathrm{al}}$ is an
alias of a combination of $n$ signal components with unique sigs
$\mathrm{sig}_{\mathrm{ind}}$, then the individual significances are probably
higher than $\mathrm{sig}_{\mathrm{al}} / n^2$. In terms of the AntiAlC
parameter, one obtains

$$n \approx \frac{1}{\sqrt{p_{\mathrm{al}}}} \qquad (13)$$

for the approximate number of signal components that can be assigned alias-free
for a given AntiAlC parameter $p_{\mathrm{al}}$. Based on these considerations,
SigSpec evaluates the AntiAlC computation depth using the AntiAlC parameter, if
the keyword antialc:depth is not provided in the .ini file.
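
As a brief numerical illustration, assume the relation between the number of
alias-free components and the AntiAlC parameter takes the form
$n \approx 1/\sqrt{p_{\mathrm{al}}}$, which follows from the argument that
individual sigs exceed the alias sig divided by $n^2$. How SigSpec rounds this
value to an integer depth is not stated here and is left open.

```latex
p_{\mathrm{al}} = 0.25:\quad n \approx \frac{1}{\sqrt{0.25}} = 2,
\qquad
p_{\mathrm{al}} = 0.5:\quad n \approx \frac{1}{\sqrt{0.5}} \approx 1.41 .
```

A smaller AntiAlC parameter thus admits more candidate peaks and calls for a
deeper test cascade, at correspondingly higher computational cost.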

The AntiAlC mode produces additional screen output. If a combination of
candidate peaks yields a lower residual scatter than the previous minimum, a
two-line screen message is returned. The first line is a set of indices. In the
example discussed here, the AntiAlC parameter (keyword antialc:par) is set to
0.5, and the AntiAlC computation depth (keyword antialc:depth) is 3.
Correspondingly, the first line of output applies to the first of altogether
ten candidate peaks in the first iteration, the first out of three in the
second iteration, and the first out of seven in the third iteration. This peak
constellation produces an rms deviation of residuals as displayed in the second
line of output (in the example 0.00405851). After finishing the test cascade,
the number of iterations specified by the keyword antialc:adopt (in the
present example, this number is 2) is adopted for the main cascade. The screen
output produced by the main cascade is the same as for a normal SigSpec
prewhitening cascade without AntiAlC. The files containing spectra and
residuals, respectively, are updated each time the residual rms deviation
improves.

Example.^ The sample project antialc illustrates the anti-aliasing correction
using the same sampling as the data for the sample project timeres (p. 46).
The synthetic light curve consists of

1. a sinusoid with frequency 6.5598 cycles per day, amplitude 7.29 mmag,

2. a sinusoid with frequency 8.5637 cycles per day, amplitude 6.87 mmag,

3. Gaussian noise with 7.36 mmag rms deviation,

as displayed in Fig. 21. The two signal frequencies differ by almost exactly
2 cycles per day and may easily be misidentified as aliases of each other.
There are two identical versions of the light curve provided for comparison:
alc.dat and antialc.dat.

^The computation of the sample project antialc takes 7 minutes on an Intel Core2
CPU T5500 (1.66 GHz) under Linux 2.6.18.8-0.9-default i686.

Figure 21: Time series used for the sample project antialc (dots). The sampling
represents 14 nights of Strömgren y photometry of the δ Sct star 4 CVn, acquired
in February and March, 2007. The magnitude values are synthesized forming two
sinusoidal signals (solid line) plus Gaussian noise. Abscissa: HJD - 2450000.

The file alc.dat corresponds to the project directory alc, representing
a normal SigSpec run without a file alc.ini. Running SigSpec alc, the
resulting frequencies (screen output) are

1 freq 7.55917 sig 55.8792 rms 10.0617 csig 55.8792
2 freq 5.55706 sig 31.5539 rms 8.65888 csig 31.5539
3 freq 10.6668 sig 11.011 rms 7.81469 csig 11.011
4 freq 2.55231 sig 4.9934 rms 7.60001 csig 4.9934

Instead of the two signal components, 1-cycle-per-day aliases are identified.
The significance and Fourier amplitude spectra of the dataset show the
highest peak at 7.56 cycles per day, which represents a superposition of the
first upper side peak of the signal at 6.56 cycles per day and the first lower
side peak of the signal at 8.56 cycles per day (Fig. 22). This leads to an
imperfect prewhitening of the two components, and the remaining signal is
detected as a third component at 9.56 cycles per day.

Figure 22: Fourier spectra for the sample project antialc. Left: significance
spectrum. Right: DFT amplitudes.

The alternative AntiAlC analysis is provided by the file antialc.ini,
which contains the specifications

antialc:par 0.5
antialc:depth 2
antialc:adopt 1
antialc:siglimit 4

All peaks that reach at least 50 % of the highest significance in the spectrum
are taken into account. SigSpec computes two consecutive iterations, but
adopts only the first of these two iterations. A sig limit of 4 is assumed for
the AntiAlC calculations (contrary to the default sig limit of 5, which remains
valid as a breakup condition for the whole procedure). Running SigSpec antialc,
the screen output is

1 freq 6.55844 sig 55.0218 rms 10.0617 csig 55.0218
2 freq 8.56169 sig 43.6737 rms 8.68212 csig 43.6737
3 freq 33.7207 sig 3.97249 rms 7.48075 csig 3.97249

Both signals are recovered at a reasonable frequency accuracy. Moreover,
according to the file antialc/result.dat, the amplitudes of the two signals
are recovered to a satisfactory precision (7.22 mmag, 6.47 mmag).

11. Analysis of Harmonics

If a non-sinusoidal but periodic process is measured, the DFT does not only
produce the fundamental frequency, which is the repetition rate of the
non-sinusoid. The shape of the periodicity is recovered by a number of
harmonics (also called overtones), the frequencies of which are integer
multiples of the fundamental. In this case it may be considered insufficient to
determine the exact frequency of the process by employing only the peak at the
fundamental frequency and ignoring the harmonics. The keyword harmonics,
followed by an integer determining the upper limit of the harmonic order,
allows computing the sig of the fundamental plus the desired number of
overtones. The specification harmonics 20 forces SigSpec to take into account
altogether 21 frequencies.

As pointed out by Reegen (2007), SigSpec treats False-Alarm Probabilities
in a statistically clean and unbiased way. In analogy to the comb analysis
introduced by Kjeldsen et al. (1995), but benefiting from the exact statistical
treatment of noise, it is possible to extend the method in order to evaluate the
probability of a whole set of peaks to be generated by noise simultaneously.
This strategy helps to take into account a fundamental frequency plus a set
of integer multiples at once and permits evaluating the most likely solution
for a non-sinusoidal signal. In addition, the Fourier Space parameters obtained
for the signal components provide a fit to the data in terms of a fundamental
frequency plus overtones.

Given a set of amplitude levels $A_h$, $h = 0, 1, \ldots, H$, at different
frequencies with associated False-Alarm Probabilities
$\Phi_{\mathrm{FA}}(A_h)$, the probability that all amplitude levels are due to
noise is given by the product of the individual False-Alarm Probabilities,

$$\Phi_{\mathrm{FA}}\left(\bigwedge_{h=0}^{H} A_h\right) = \prod_{h=0}^{H} \Phi_{\mathrm{FA}}(A_h), \qquad (14)$$

if the noise amplitudes at the different frequencies are assumed statistically
independent.

Since the sig is defined as the negative logarithm of False-Alarm Probability,
the above expression leads to

$$\mathrm{sig}\left(\bigwedge_{h=0}^{H} A_h\right) = \sum_{h=0}^{H} \mathrm{sig}(A_h). \qquad (15)$$

In this context, the sig represents the number of cases in one out of which all
amplitude levels $A_h$ are not generated by noise. This logical concept is the
representation of an AND operator, as indicated by the argument to sig in the
equation.

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Reegen (2007) evaluated the expected value of the sig (ignoring the varia- 
tions with frequency and phase) to be j lege « 5.4575. Considering H differ- 
ent amplitude levels simultaneously rescales this expected sig, so that we obtain 
^ lege. This rescaling may cause inconvenience, whence we use the mean sig 
of an individual peak out of this sample of fundamental plus harmonics, 

$$\mathrm{msig}(A_h) := \frac{1}{H}\,\mathrm{sig}\left(\bigwedge_{h=0}^{H} A_h\right), \qquad (16)$$

instead. It is the expected sig obtained for an arbitrarily picked element out
of the $H$ peaks: if each of the considered peaks had $\mathrm{msig}(A_h)$,
then the total sig of the fundamental plus harmonics would be
$\mathrm{sig}\left(\bigwedge_{h=0}^{H} A_h\right)$. The statistical properties
of $\mathrm{msig}(A_h)$ are the same as for the "normal" sig analysis.
If the keyword harmonics is provided in the .ini file, the sig levels returned
in the second column of the file result.dat are mean sigs.
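
The arithmetic behind the combined and the mean sig is short enough to spell
out. The sketch below assumes a base-10 logarithm in the definition of sig and
divides by the number of peaks supplied to the function; both are assumptions
of this illustration, and the function names are hypothetical.

```python
import math


def sig_from_fap(fap):
    """sig as the negative (here: decadic) logarithm of a False-Alarm Probability."""
    return -math.log10(fap)


def combined_sig(sigs):
    """sig for all peaks being due to noise at once (independent noise assumed)."""
    return sum(sigs)


def mean_sig(sigs):
    """Combined sig divided by the number of peaks supplied."""
    return combined_sig(sigs) / len(sigs)
```

Two peaks with False-Alarm Probabilities of 0.001 and 0.01 have a joint
probability of 0.00001, so the individual sigs of 3 and 2 add up to a combined
sig of 5, and the mean sig is 2.5.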

The result files display only the fundamentals of the solution, and infor- 
mation on the harmonics is stored in additional output files. The names are 
generated from the name of the corresponding result file without the extension 
.dat, plus -h#index#.dat, where #index# refers to the index of the item 
in the result file. For example, the harmonics for the third component in the 
file result.dat are stored in the file result-h000003.dat. The files contain 
the harmonics in ascending order, starting with the fundamental. The three 
columns are 

1. sig of the individual peak, 

2. DFT amplitude [units of observable], 

3. Fourier-space phase angle [rad]. 



Example. The sample project harmonics illustrates the determination of
a non-sinusoidal signal using the analysis of harmonics. The dataset
represents (yet unpublished) space photometry of a star that exhibits surface
activity. The task is to determine the rotation period of the star. For
comparison, two identical versions of the time series are available (Fig. 23).
The file noharmonics.dat is used together with the file noharmonics.ini to
perform a SigSpec analysis without harmonics; the output is written to the
project directory noharmonics. The file noharmonics.ini contains four lines:

ufreq 13
freqspacing .001
iterations 1
siglimit 0



Figure 23: Time series used for the sample project harmonics.



In this constellation, SigSpec computes the significance spectrum between
0 and 13 cycles per day, with steps of 0.001 cycles per day (Fig. 24, left
panel). Only one iteration (i.e. no prewhitening) is performed. The highest
peak is found at 0.296 cycles per day, which corresponds to a period of 3.38
days.




Figure 24: Fourier spectra for the sample project harmonics. Left: significance
spectrum without employing the analysis of harmonics (solid line). The fundamental
and twelve harmonics of the alternative solution are indicated by vertical dashed lines
for comparison. Right: significance spectrum displaying the mean sig for fundamental
plus twelve harmonics (solid line). Note that the frequency interval differs from the
left panel. For comparison, the solution without harmonics is displayed as a dashed
line.

The file harmonics.dat is the same as noharmonics.dat, but the associated
file harmonics.ini specifies a different setup by the lines

lfreq 0.125
ufreq 1
freqspacing .001
iterations 1
siglimit 0
harmonics 12

It is advisable not to set the lower frequency limit to zero, because below
the Rayleigh frequency resolution, consecutive harmonics hit the same peak
and produce unreliable results. In the present case, the Rayleigh frequency
resolution is 0.091 cycles per day, and to be fairly on the safe side, the lower
frequency limit is adjusted to 0.125 cycles per day. Fig. 24 (right panel)
contains the mean sig of the fundamental plus twelve harmonics vs. frequency.

The amplitudes of the fundamental and twelve harmonics are displayed
vs. frequency in Fig. 25. The maximum sig is found at 0.155 cycles per day,
i.e., the rotation period is 6.46 days, indicating that the analysis without
harmonics led to a misidentification of the first harmonic as the "true"
rotational frequency. For comparison, the left panel of Fig. 24 contains the
fundamental plus harmonics found by this procedure as vertical dashed lines.

Moreover, for the analysis of harmonics, there is additional information
in the screen output provided by SigSpec. The standard screen output for
the project noharmonics contains the lines

*** preparing to run SigSpec *******************************
Rayleigh frequency resolution        0.0914470160931467
oversampling ratio                  91.4470160931467433
frequency spacing                    0.0010000000000000
lower frequency limit                0.0010000000000000
upper frequency limit               13.0000000000000000
Nyquist coefficient                  0.9993990384615384
number of frequencies            13000

For the project harmonics, the corresponding output is richer.

*** preparing to run SigSpec *******************************
Rayleigh frequency resolution        0.0914470160931467
oversampling ratio                  91.4470160931467433
frequency spacing                    0.0010000000000000
lower frequency limit                0.1250000000000000
upper frequency limit               13.0000000000000000
Nyquist coefficient                  1.0000000000000000
number of frequencies            12876
upper fundamental frequency          1.0000000000000000
number of fundamental frequencies  876

Although the upper frequency limit is set to 1 cycle per day by the keyword
ufreq, SigSpec has to compute the Fourier spectrum up to a frequency of 13
cycles per day in order to cover also the 12 harmonics. Two additional lines
are provided corresponding to the upper limit for the fundamental
frequencies, which is related to the specification by ufreq in the file
harmonics.ini, and the number of fundamental frequencies.

Figure 25: Frequencies and amplitudes of the harmonics associated to the most
significant signal found for the sample project harmonics (dots with drop lines).
The DFT amplitudes obtained by SigSpec without employing the analysis of harmonics
are displayed as a solid line for comparison.
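
The numbers in the two screen outputs can be cross-checked with simple
arithmetic. The relations used below are inferred from the printed values
(grid points counted inclusively, oversampling ratio taken as Rayleigh
resolution divided by frequency spacing) and should be read as a plausibility
check, not as a statement of how SigSpec computes them; the function names are
hypothetical.

```python
def n_frequencies(lower, upper, spacing):
    """Number of grid points from `lower` to `upper` inclusive."""
    return round((upper - lower) / spacing) + 1


def spectrum_upper_limit(ufreq, harmonics):
    """Highest frequency needed to cover the fundamental plus its harmonics."""
    return ufreq * (harmonics + 1)


def oversampling_ratio(rayleigh, spacing):
    """Rayleigh frequency resolution in units of the frequency spacing."""
    return rayleigh / spacing
```

With ufreq 1 and harmonics 12, the spectrum has to reach 13 cycles per day; a
lower limit of 0.125 and a spacing of 0.001 then give 12876 frequencies in
total, 876 of which are fundamental frequencies.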

12. MultiFile Mode 

12.1. How to handle multiple time series 

An additional feature of SigSpec is the ability to handle multiple time series 
input files at once. This increases the performance of the program significantly, 
if the time values in all input files are identical. 

• The user has to provide only one project directory <project> - just as
in SingleFile mode (as described in "Projects", p. 7).

• Parameter specifications in the file <project>.ini are applied uniformly
to all time series input files. SigSpec therefore expects the same column
format for all time series input files.

• Time series files have to be indexed as #multifile#.<project>.dat,
where #multifile# represents a six-digit index starting with 000000.
Note that strictly ascending indices are required.

• All output files are supplied with the leading index #multifile#. For
example, 000012.s000009.000002 denotes the significance spectrum for
iteration 000009 and time interval 000002 in a time-resolved analysis of
the input file with index 000012 in MultiFile mode.

• The MultiFile mode is activated by the keyword multifile, followed by
an integer value. This value is interpreted as the maximum index up to
which the calculations shall be performed. This permits a restriction for,
e.g., test runs. If the index limit is assigned a negative value, SigSpec
analyses as many files as available.

• The sampling profile of the file 000000.<project>.dat is always written
to a file. For subsequent and consistent time series, the sampling profile
is taken from this file, which saves computation time. Only if the new
time values are inconsistent with those of the precursor, the profile is
re-calculated and stored in a corresponding output file for later use. The
keyword profile in the .ini file is ignored in MultiFile mode.

In MultiFile mode, SigSpec terminates if a #multifile# index is reached
for which no time series input file is available.

A further keyword to restrict the MultiFile analysis is mfstart, which
permits specifying a MultiFile index to start with (instead of 0).
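
The selection of input files in MultiFile mode can be mimicked as follows. The
sketch works on a set of file names instead of a real directory, and the
function name and its arguments are hypothetical; it merely encodes the rules
stated above (start index, maximum index, negative limit, stop at the first
missing file).

```python
def multifile_sequence(available, project, mfstart=0, multifile=-1):
    """Names of the input files that would be processed in MultiFile mode.

    available -- set of existing file names (stands in for a directory listing)
    project   -- project name, as in <project>
    mfstart   -- first MultiFile index
    multifile -- maximum index; a negative value means "as many as available"
    """
    names = []
    index = mfstart
    while multifile < 0 or index <= multifile:
        name = f"{index:06d}.{project}.dat"
        if name not in available:
            break  # the first gap in the sequence ends the run
        names.append(name)
        index += 1
    return names
```

If a file in the middle of the sequence is missing, all files behind the gap
are left unprocessed, even if they exist.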

Example. The two lines 

mfstart 4 
multifile 16 

activate the MultiFile mode for input files from 000004.<project>.dat to 
000016.<project>.dat. 
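The file selection described above can be sketched in a few lines of Python. This is only an illustration of the rules stated in this section (six-digit index, start at mfstart, stop at the limit or at the first missing file, negative limit meaning "as many as available"); the function name multifile_inputs and the project name demo are hypothetical and not part of SigSpec.

```python
import tempfile
from pathlib import Path


def multifile_inputs(directory, project, mfstart=0, limit=-1):
    """Collect consecutive #multifile#.<project>.dat file names.

    Stops at the first missing index, or at `limit` if it is non-negative.
    """
    found = []
    index = mfstart
    while limit < 0 or index <= limit:
        path = Path(directory) / f"{index:06d}.{project}.dat"
        if not path.exists():
            break  # gap in the sequence: later files are not reached
        found.append(path.name)
        index += 1
    return found


# Demonstration with empty placeholder files 000004 ... 000016, except 000010.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(4, 17):
        if i != 10:
            (Path(tmp) / f"{i:06d}.demo.dat").touch()
    selected = multifile_inputs(tmp, "demo", mfstart=4, limit=16)

print(selected[0], selected[-1], len(selected))
```

In this demonstration the sequence ends with 000009.demo.dat, because the file with index 000010 is missing.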

The big advantage of the MultiFile mode is that sampling profiles are 
computed only if necessary. If the time-domain sampling is identical to that 
of a previously examined time series, the sampling profile of that time series 
is re-used. If the keyword profile is set in the .ini file, a file assign.log is 
generated. It contains a table of assignments between time series file indices 
and profile indices. 

Example. The line 

000013 000002 

in the file assign.log means that for 000013.<project>.dat, the profile 
with index 000002 is used. 

Example. The sample project multifile illustrates the simultaneous anal- 
ysis of multiple time series. The project contains 10 time series files from 
000457.multifile.dat to 000467.multifile.dat. However, the files do 
not represent a complete sequence, since 000466.multifile.dat is missing. 
The lines 

mfstart 457 
multifile 467 

in the file multifile.ini would force SigSpec to process the complete 
sequence of time series input files. Indeed, the program starts with the file 
000457.multifile.dat and proceeds until 000465.multifile.dat. Since 
the next file, 000466.multifile.dat, is missing, it stops its calculations 
with 000465.multifile.dat and displays a corresponding warning: 

Warning: MultiFile_Count 002 
MultiFile limit exceeds number of available 
time series input files, limit re-adjusted to 465. 

The keyword profile in the file multifile.ini forces SigSpec to generate 
the following files in the project directory: 

000457.profile.dat 
000458.profile.dat 
000460.profile.dat 
000463.profile.dat 

The reason why only four profiles are computed for nine time series is found 
in the file assign.log: 

time series input file    profile and spectral window 
000457                    000457 
000458                    000458 
000459                    000457 
000460                    000460 
000461                    000457 
000462                    000457 
000463                    000463 
000464                    000457 
000465                    000457 

The contents of this file tell the user that the samplings of all time series files 
are identical, except for those with indices 000458, 000460 and 000463. In 
order to speed up the computations, SigSpec generates only one profile for 
the files with identical sampling and re-uses this profile for all of them. The 
first file with this sampling is 000457.multifile.dat, and the associated 
profile is also used for 

000459.multifile.dat 
000461.multifile.dat 
000462.multifile.dat 
000464.multifile.dat 
000465.multifile.dat 

If the keyword win is added to the file multifile.ini, this assignment 
applies to the files containing the spectral windows as well. 
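The assignment logic reflected in assign.log can be mimicked by a short Python sketch. It assumes that two samplings count as identical only if all time values agree exactly; whether SigSpec applies a tolerance is not stated here. The function name assign_profiles and the toy time values are hypothetical.

```python
def assign_profiles(samplings):
    """Map each file index to the index of the first file with identical time values.

    `samplings` is a dict {file_index: sequence of time values}.
    Exact equality of the time values is an assumption of this sketch.
    """
    first_with_sampling = {}
    assignment = {}
    for index in sorted(samplings):
        key = tuple(samplings[index])
        # Remember the first file that shows this sampling ...
        first_with_sampling.setdefault(key, index)
        # ... and let every later file with the same sampling re-use its profile.
        assignment[index] = first_with_sampling[key]
    return assignment


a = [0.0, 0.1, 0.2]
b = [0.0, 0.1, 0.3]
demo = {457: a, 458: b, 459: a, 460: [0.0, 0.5], 461: a}
table = assign_profiles(demo)
for series, profile in table.items():
    print(f"{series:06d} {profile:06d}")
```

The printed table has the same two-column layout as assign.log: a file whose sampling has been seen before points to the earlier profile, and every other file points to itself.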

12.2. Differential significance spectra 

Practical astronomical time series analysis occasionally comes along with target 
and comparison datasets that show coincident peaks in the DFT amplitude 
spectra. In this case, SigSpec provides a possibility to compute the probability 
that a peak in the target dataset is significant in spite of a given peak in the 
comparison dataset. Moreover, multiple target and/or comparison datasets may 
be handled the same way. The idea is to identify common (instrumental and/or 
environmental) effects and to distinguish them from periodicities exclusively 
found in a target dataset. 

In the .ini file, there are three different keywords reserved for the specifi- 
cation of dataset types. Each expects one integer parameter representing the 
MultiFile index of the dataset under consideration. 

1. The keyword target specifies a target dataset. 

2. The keyword comp specifies a comparison dataset. 

3. The keyword skip specifies a dataset to be ignored. 

To enhance the convenience for the user, not all files need to be specified. The 
keyword deftype may be used to assign a default dataset type. 

1. Use deftype target to assign the target attribute by default. If no 
deftype keyword is provided, this setting is activated. 

2. Use deftype comp to assign the comp attribute by default. 

3. Use deftype skip to assign the skip attribute by default. 

Sampling profiles need to be computed for target datasets only. If the keyword 
profile is given in the .ini file, sampling profiles will only be generated 
for target datasets, and the file assign.log will also contain target datasets 
only. 

To make datasets comparable even if their quality is different, the DFT 
spectra of the comparison datasets are scaled according to the power integral 
over the entire frequency range under consideration. 

Instead of the observables c_k, k = 0, 1, ..., K of a comparison dataset, the 
transformed quantities 

are used, where x_l, l = 0, 1, ..., L denotes the observables of the target dataset 
under consideration and P indicates the power integral of the quantity in 
parentheses. A DFT is calculated for each comparison dataset. There are two 
options to determine the resulting amplitude A_T to be compared to the target 
amplitude A. 

By default, the sig measures the probability of a peak generated by noise at 
the same variance as that of the given time series. When differential sigs are 
computed, the normalisation has to be modified, since part of the power found 
in the target spectrum is assumed to be due to corresponding power in a 
comparison spectrum. To take this into account appropriately, a factor 

is introduced, where dP is the power integral of the difference between the 
target data and the transformed comparison data. Correspondingly, the 
differential sig is a measure of the probability that the additional power with 
respect to the comparison dataset is due to noise. 

1. If the keyword diff:comp is set in the .ini file, a weighted arithmetic 
mean of the Fourier vectors, averaged over all comparison datasets, is used 
to calculate A_T. The numbers of data points the comparison datasets 
consist of are used as weights. This option considers signal common 
among the comparison datasets only if the phases are aligned. Following 
the formalism by Reegen (2007), the Cartesian representation of the 
differential sig evaluates to 

sig(a_{ZM}, b_{ZM} | \omega) = \gamma \frac{K \log e}{\langle x^2 \rangle} \left\{ \left[ \frac{(a_{ZM} - a_{TZM}) \cos\theta_0 + (b_{ZM} - b_{TZM}) \sin\theta_0}{\alpha_0} \right]^2 + \left[ \frac{(a_{ZM} - a_{TZM}) \sin\theta_0 - (b_{ZM} - b_{TZM}) \cos\theta_0}{\beta_0} \right]^2 \right\} . (19) 

2. If the keyword diff:compalign is set in the .ini file, a weighted 
arithmetic mean of the DFT amplitudes, averaged over all comparison datasets, 
is considered as A_T. The numbers of data points the comparison datasets 
consist of are used as weights. This option considers signal common 
among the comparison datasets even if they lag in phase. The differential 
sig is obtained through 

sig(A | A_T) = \gamma \frac{K (A - A_T)^2 \log e}{4 \langle x^2 \rangle} \left[ \frac{\cos^2(\theta - \theta_0)}{\alpha_0^2} + \frac{\sin^2(\theta - \theta_0)}{\beta_0^2} \right] , (20) 

following the notation introduced by Reegen (2007). 

The default setting is diff:off, which switches off the computation of 
differential sigs. 

Additional output is provided in the spectra (see p. 28), where columns 6 
and 7 contain the DFT amplitudes and phases of the transformed comparison 
dataset, respectively. 

Example. The sample project diffsig illustrates the analysis of target and 
comparison time series using differential significance spectra. There are nine 
time series input files available, indexed from 000038 through 000046. The 
file diffsig.ini contains the lines 

mfstart 38 
multifile -1 

which force SigSpec to start with the file 000038.diffsig.dat and process 
all available datasets. In this case, SigSpec takes into account all files 
from 000038 to 000046. The two lines 

deftype target 
comp 38 

in the file diffsig.ini define the file 000038.diffsig.dat as a comparison 
dataset and the rest as targets. Thus differential significance spectra are 
calculated for all time series from 000039 through 000046, with respect to 
000038 as comparison data. The calculation of differential sigs is activated 
by the line 

diff:compalign 

in the file diffsig.ini, which produces differential sigs without respect to 
phase lags between comparison and target signals. The computations are 
made faster by the lines 

ufreq 7 
siglimit 
iterations 1 

The sampling of the input file 000038.diffsig.dat represents the V 
photometry of IC 4996 #89 (see Example SigSpecNative, p. 8), and the 
observable is a synthetically generated signal with unit amplitude at a 
frequency of 3.125 cycles per day, plus Gaussian noise with 5 units rms 
deviation. The corresponding significance spectrum, as obtained by typing 

SigSpec 000038.diffsig 

is displayed in the bottom panel of Fig. 26. The eight upper panels contain 
the differential significance spectra of the time series 000039 to 000046. 
These datasets contain 11 649 points and are based on the sampling used in 
the project harmonics (p. 54). Gaussian noise with a standard deviation of 
100 units is generated. Just as in case of the comparison data, a sinusoid 
at 3.125 cycles per day is synthesized, but the phase is not the same as for 
000038.diffsig.dat. The amplitudes of this signal are 5 units for 000039, 
6 units for 000040, 7 units for 000041, 8 units for 000042, 9 units for 
000043, 10 units for 000044, 11 units for 000045, and 12 units for 000046. 
With increasing signal amplitude in the target data, the differential sig of the 
main peak consistently increases. In Fig. 26 the datasets 000039 to 000046 
are displayed from bottom to top. 
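The difference between the two averaging options for the comparison amplitude (Section 12.2) may be illustrated by a small Python sketch. It only demonstrates the two weighted means named in the text (Fourier vectors versus DFT amplitudes, weighted by the numbers of data points); the function name comparison_amplitudes and the representation of Fourier vectors as complex numbers are assumptions made for illustration, not SigSpec code.

```python
import cmath


def comparison_amplitudes(vectors, n_points):
    """Return (|weighted mean of Fourier vectors|, weighted mean of amplitudes).

    vectors  : complex Fourier vectors of the comparison datasets at one frequency
    n_points : numbers of data points of the comparison datasets (the weights)
    """
    total = sum(n_points)
    mean_vector = sum(n * v for v, n in zip(vectors, n_points)) / total
    mean_amplitude = sum(n * abs(v) for v, n in zip(vectors, n_points)) / total
    return abs(mean_vector), mean_amplitude


# Two comparison datasets with equal amplitude but opposite phase.
vectors = [cmath.rect(1.0, 0.0), cmath.rect(1.0, cmath.pi)]
amp_vector_mean, amp_amplitude_mean = comparison_amplitudes(vectors, [100, 100])
print(amp_vector_mean, amp_amplitude_mean)
```

For signals in antiphase the vector mean practically cancels, whereas the amplitude mean stays at unity. This corresponds to the statement that averaging Fourier vectors responds to common signal only if the phases are aligned, while averaging amplitudes also responds if the signals lag in phase.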

13. The Built-in Simulator 

SigSpec contains a simulator to generate and analyse synthetic time series. 
To activate the simulator, a sequence of keywords may be given in the .ini file 
to generate a variety of datasets. The sampling is taken from the time series 
input file. 

The simulator activities specified by sim:signal, sim:poly, sim:exp, 
sim:serial, sim:temporal, sim:rndsteps, and sim:zeromean are interpreted 
as a sequence and performed step by step, following their order in the 
.ini file. 

The synthetic time series is saved as a file with the same name as the input, 
but in the project directory, to avoid accidental overwriting of original data. 
If the time series input file is named <project>.dat, then the synthetic time 
series is <project>/<project>.dat. In MultiFile mode, if the time series 
input files are named #multifile#.<project>.dat, the synthetic time series 
are <project>/#multifile#.<project>.dat. 

Figure 26: Differential significance spectra for the sample project diffsig. Bottom: 
significance spectrum of comparison data, representing a sinusoidal signal at 3.125 
cycles per day (grey line), plus Gaussian noise. Top eight panels: Differential 
significance spectra for target time series representing the Gaussian noise plus a 
sinusoidal signal at 3.125 cycles per day. Both the time-domain sampling and the 
signal phase differ from the comparison data. From bottom to top, the amplitude of 
this signal increases. 

13.1. The simulator mode 



SigSpec supports two different simulator modes, plus a keyword to switch the 
simulator off. 

1. The keyword sim:add runs the simulator in additive mode. The program 
keeps the original observable values and adds the synthetic values. For 
example, this function is useful to add synthetic noise to a given time 
series. 

2. The keyword sim:replace forces the simulator to overwrite the original 
observable values with the synthetic values. 

3. The keyword sim:off is used if no simulator activity is desired. Since 
the simulator is deactivated by default, this keyword is redundant and 
only implemented for completeness. 

13.2. Random numbers 

The SigSpec simulator is capable of modelling three different types of random 
processes: 

• serially correlated noise (keyword sim:serial, p. 72), 

• temporally correlated noise (keyword sim:temporal, p. 74), 

• random steps (keyword sim:rndsteps, p. 76). 

The random number generator employed for these models may be initialised 
in two different ways. 

1. The user may pass an integer value to the program. This value has to be 
written into a file <project>.rnd. 

2. If the file <project>.rnd is not present, the simulator initialises the 
random number generator using the system time. 

The last integer value in the sequence of random numbers is written to a file 
<project>/<project>.rnd. This allows SigSpec to be embedded into an external 
loop for numerical simulations. If the output file <project>/<project>.rnd 
is moved to <project>.rnd externally between consecutive SigSpec runs, the 
program may be used iteratively without breaking the random number sequence. 

Example. The simulator is employed in the sample projects sim-serial, 
sim-temporal and sim-rndsteps. To initialise the random number generator, 
a file sim-serial.rnd, sim-temporal.rnd and sim-rndsteps.rnd, 
respectively, is used to make the output reproducible. 

Consequently, the user has three options to explore these samples. 

1. If the samples are processed as they are, SigSpec reproduces the given 
output exactly. 

2. If the .rnd file in the input directory is removed by the user, SigSpec 
produces a new set of random numbers. The random number generator 
is initialised employing the system time. 

3. If the content of the .rnd file in the input directory is modified by the 
user, SigSpec produces a new set of random numbers. The random 
number generator is initialised employing the new number in the .rnd 
file. 
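The external loop mentioned in this section (moving <project>/<project>.rnd back to <project>.rnd between runs) might be organised as in the following Python sketch. The function run_chained and the stand-in fake_run are hypothetical; in practice the callable would launch SigSpec itself, for instance through subprocess with the command line shown elsewhere in this manual, which is not done here so that the sketch stays self-contained.

```python
import tempfile
from pathlib import Path


def run_chained(workdir, project, n_runs, run_once):
    """Call run_once(workdir, project) n_runs times, feeding each run's
    last random number back as the seed file of the next run."""
    workdir = Path(workdir)
    for _ in range(n_runs):
        run_once(workdir, project)
        produced = workdir / project / f"{project}.rnd"
        if produced.exists():
            # Move <project>/<project>.rnd to <project>.rnd (overwrites the old seed).
            produced.replace(workdir / f"{project}.rnd")


def fake_run(workdir, project):
    """Stand-in for a SigSpec run (assumption): read the seed and
    write seed + 1 as the 'last value' of the random sequence."""
    seed = int((workdir / f"{project}.rnd").read_text())
    out_dir = workdir / project
    out_dir.mkdir(exist_ok=True)
    (out_dir / f"{project}.rnd").write_text(str(seed + 1))


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "demo.rnd").write_text("100")
    run_chained(tmp, "demo", 3, fake_run)
    final_seed = int((Path(tmp) / "demo.rnd").read_text())

print(final_seed)
```

With the stand-in above, three chained runs advance the seed file from 100 to 103, which shows that each run starts where the previous one ended.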

13.3. Sinusoidal signal 

The keyword sim:signal is given with five floating-point parameters. They 
specify 

1. the lower time limit, 

2. the upper time limit, 

3. the amplitude, 

4. the time zeropoint (a fixed time where the signal shall attain a maximum), 
and 

5. the frequency [inverse time units]. 

If the lower and upper time limits are both set zero, the signal is generated for 
the entire time base. 
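The five parameters of sim:signal can be read as in the following Python sketch. It assumes a cosine that attains its maximum at the time zeropoint (following the parameter list above) and assumes that the interval limits are inclusive; the actual implementation in SigSpec may differ in these details. The function name sim_signal is hypothetical.

```python
import math


def sim_signal(times, t_lo, t_hi, amplitude, t_zero, frequency):
    """Sinusoid with a maximum at t_zero, generated only inside [t_lo, t_hi].

    If both limits are zero, the signal covers the entire time base.
    Outside the interval the contribution is zero.
    """
    whole = (t_lo == 0 and t_hi == 0)
    return [
        amplitude * math.cos(2.0 * math.pi * frequency * (t - t_zero))
        if whole or t_lo <= t <= t_hi else 0.0
        for t in times
    ]


f, t0 = 6.24512, 2524.2356
# Zeropoint, half a period later, and a time outside the interval.
times = [t0, t0 + 0.5 / f, 2530.0]
values = sim_signal(times, 2521, 2525, 0.00543, t0, f)
print(values)
```

The three values are the amplitude, its negative, and zero, respectively. In additive mode such values would be added to the original observable, in replace mode they would take its place.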

Example. The sample project sim-signal contains the simulation and 
analysis of two sinusoidal signals, one over the entire time base, one on 
a restricted time interval. In this sample project, the V photometry of 
IC 4996 #89 (see Example SigSpecNative, p. 8) is modified, according to 
the line 

sim:add 

in the file sim-signal.ini. The line 

sim:signal 0 0 0.00121 2521.4542 4.68573 

produces a sinusoidal signal over the entire time base (corresponding to the 
first two arguments being zero). The amplitude is 1.21 mmag, and the 
frequency is 4.68573 cycles per day. At HJD 2452521.4542 the sinusoid shall 
attain zero value. Correspondingly, the line 

sim:signal 2521 2525 0.00543 2524.2356 6.24512 
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Figure 27: Time series generated by the simulator in the sample project sim-signal. 
Open circles: Original V photometry of IC 4996 #89. Dots: Two sinusoidal signals 
added by the simulator. 



is associated to a sinusoid with amplitude 5.43 mmag, frequency 6.24512 
cycles per day, and a zeropoint at HJD 2452524.2356. This signal is not 
generated for the entire time base but only from HJD 2452521 to HJD 2452525. 
Fig. 27 displays the light curves of the original and the synthetic data. 

The screen output contains the lines 

*** simulator: add ***************************************** 

signal 
signal 

indicating that the simulator adds the synthetic values to the original 
observables, and that two sinusoids are generated. 

Fig. 28 compares the Fourier spectra of the synthetic time series to those of 
the original time series (as used in Example SigSpecNative, p. 8, and displayed 
in Fig. 2, p. 12). Both signals introduced by the simulator are identified, but 
the prewhitening of the component at 6.25 cycles per day is performed over 
the whole time base, although the signal is present only in an interval. This 
introduces additional noise, which causes the signal at 3.99 cycles per day 
to drop below the significance limit of 5 and prevents the detection of the 
component at 5.41 cycles per day. 

Figure 28: Fourier spectra for the sample project sim-signal. Left: 
significance spectra. Right: DFT amplitudes. Top: original spectra (file 
SigSpecNative/s000000.dat). Bottom: spectra with two sinusoidal signals added. 
All spectra are plotted grey. The significant components are indicated by black 
dots with dashed drop lines (file SigSpecNative/result.dat for the top panels, 
file sim-signal/s000000.dat for the bottom panels). The default sig threshold of 
5 is represented by a horizontal dashed line in the left panels. 

13.4. Polynomial trend 

The keyword sim:poly is given with five floating-point parameters. They 
specify 

1. the lower time limit, 

2. the upper time limit, 

3. the coefficient P_0, 

4. the time zeropoint t_0, and 

5. the exponent X. 

If the exponent is a non-integer number, the simulator evaluates 

P(t) := P_0 |t - t_0|^X (21) 

and produces a power function. 

For integer exponents, the trend is generated by the relation 

P(t) := P_0 (t - t_0)^X . (22) 

Thus a full polynomial may be constructed by multiple keywords sim:poly with 
different parameters and integer exponents. 

If the lower and upper time limits are both set zero, the polynomial trend 
is generated for the entire time base. 
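Equations 21 and 22 translate into Python as sketched below. The sketch assumes inclusive interval limits and a zero contribution outside the interval; both are assumptions, as is the function name sim_poly. It also shows how two calls with integer exponents add up to a full polynomial.

```python
def sim_poly(times, t_lo, t_hi, p0, t_zero, exponent):
    """Power-law trend: p0 * (t - t_zero)**X for integer X (Eq. 22),
    p0 * |t - t_zero|**X for non-integer X (Eq. 21)."""
    whole = (t_lo == 0 and t_hi == 0)
    is_integer = float(exponent).is_integer()
    trend = []
    for t in times:
        if not (whole or t_lo <= t <= t_hi):
            trend.append(0.0)  # outside the interval nothing is generated
            continue
        base = t - t_zero
        if is_integer:
            trend.append(p0 * base ** int(exponent))
        else:
            trend.append(p0 * abs(base) ** exponent)
    return trend


times = [1.0, 2.0, 3.0]
linear = sim_poly(times, 0, 0, 2.0, 2.0, 1)       # 2 (t - 2)
quadratic = sim_poly(times, 0, 0, 0.5, 2.0, 2)    # 0.5 (t - 2)^2
polynomial = [a + b for a, b in zip(linear, quadratic)]
root = sim_poly(times, 0, 0, 1.0, 2.0, 0.5)       # |t - 2|^0.5
print(polynomial, root)
```

The sum of the linear and the quadratic term corresponds to two consecutive sim:poly keywords in additive sequence; the non-integer exponent yields a symmetric power function around the zeropoint.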
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Figure 29: Time series generated by the simulator in the sample project sim-poly. 
The sampling represents the V photometry of IC 4996 #89. The simulator replaces 
the original observable by 16 different power functions. 



Example. The sample project sim-poly contains the simulation and 
analysis of 16 individual power functions defined on different time intervals 
(Fig. 29, p. 69). The sampling of the V photometry of IC 4996 #89 is used, 
and the simulator replaces the original observable values, according to the 
line 

sim:replace 

in the file sim-poly.ini. The specifications for the power functions are 
contained in the lines 

sim:poly 
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The screen output contains the lines 

*** simulator: replace ************************************* 

polynomial trend 
polynomial trend 
polynomial trend 
polynomial trend 
polynomial trend 
polynomial trend 
polynomial trend 
polynomial trend 
polynomial trend 
polynomial trend 
polynomial trend 
polynomial trend 
polynomial trend 
polynomial trend 
polynomial trend 
polynomial trend 

to indicate that the simulator replaces the original observables by the 
synthetic values, and that 16 power functions are generated. 

SigSpec detects 19 significant signal components, which are not dis- 
cussed here. 

13.5. Exponential trend 

The keyword sim:exp is given with five floating-point parameters. They specify 

1. the lower time limit, 

2. the upper time limit, 

3. the coefficient E_0, 

4. the time zeropoint t_0, and 

5. the exponent X. 

The exponential trend is generated by the relation 

E(t) := E_0 e^{X (t - t_0)} . (23) 

If the lower and upper time limits are both set zero, the exponential trend is 
generated for the entire time base. 
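Eq. 23 can be evaluated as in the following short Python sketch. As before, inclusive interval limits and a zero contribution outside the interval are assumptions, and the function name sim_exp is hypothetical.

```python
import math


def sim_exp(times, t_lo, t_hi, e0, t_zero, exponent):
    """Exponential trend e0 * exp(X * (t - t_zero)) inside [t_lo, t_hi];
    both limits zero means the entire time base."""
    whole = (t_lo == 0 and t_hi == 0)
    return [
        e0 * math.exp(exponent * (t - t_zero))
        if whole or t_lo <= t <= t_hi else 0.0
        for t in times
    ]


# At the zeropoint the trend equals E_0; one time unit later it has
# changed by the factor exp(X).
trend = sim_exp([2520.8562, 2521.8562], 0, 0, 2.2841, 2520.8562, -0.03425)
print(trend)
```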
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Figure 30: Time series generated by the simulator in the sample project sim-exp. 
The sampling represents the V photometry of IC 4996 #89. The simulator replaces 
the original observable by two exponential functions, one over the entire time base, 
and the other one on an interval between HJD 2452521.4532 and HJD 2452526.8832. 



Example. The sample project sim-exp contains the simulation and 
analysis of two exponential trends, one over the entire time base, one on a 
restricted time interval, corresponding to the lines 

sim:exp 2521.4532 2526.8832 1.3256 2526.7384 0.65834 
sim:exp 0 0 2.2841 2520.8562 -0.03425 

in the file sim-exp.ini. The sampling of the V photometry of IC 4996 #89 
is used, and the simulator replaces the original observable values, according 
to the line 

sim:replace 

The screen output contains the expression exponential trend to indicate 
that such a trend is generated. In this example, the entry is found twice. 
The resulting light curve is displayed in Fig. 30, p. 71. 

SigSpec detects 54 significant signal components, which are not dis- 
cussed here. 
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13.6. Serially correlated noise 

This simulator module produces Gaussian noise the standard deviation of which 
may vary in time according to a polynomial trend. A serial correlation coefficient 
between consecutive data points may be specified additionally. 

The keyword sim:serial is given with six floating-point parameters. They 
specify 

1. the lower time limit, 

2. the upper time limit, 

3. the coefficient \sigma_0 for the standard deviation of the Gaussian noise, 

4. the time zeropoint t_0 for the polynomial trend of the standard deviation, 

5. the exponent X for the polynomial trend of the standard deviation, and 

6. the serial correlation coefficient. 

The standard deviation of the Gaussian noise follows the relation 

\sigma(t) := \sigma_0 (t - t_0)^X . (24) 

A full polynomial may be constructed by multiple keywords sim:serial 
with different parameters. 

If the lower and upper time limits are both set zero, the noise is generated 
for the entire time base. 
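The manual does not spell out how the serial correlation is imposed. A common way to obtain Gaussian noise with a prescribed correlation between consecutive points is a first-order autoregressive recursion, which is used in the following Python sketch purely as an assumption; the function names and the seed are hypothetical, and the time limits are omitted for brevity.

```python
import math
import random


def sim_serial(times, sigma0, t_zero, exponent, r, seed=None):
    """Gaussian noise with standard deviation sigma0 * (t - t_zero)**X (Eq. 24)
    and correlation r between consecutive points (AR(1) recursion, assumed)."""
    rng = random.Random(seed)
    noise = []
    unit = rng.gauss(0.0, 1.0)  # unit-variance process, first point
    for i, t in enumerate(times):
        if i > 0:
            # Keeps unit variance while correlating with the previous point.
            unit = r * unit + math.sqrt(1.0 - r * r) * rng.gauss(0.0, 1.0)
        noise.append(sigma0 * (t - t_zero) ** exponent * unit)
    return noise


def lag1_correlation(x):
    """Sample correlation between consecutive values."""
    mean = sum(x) / len(x)
    num = sum((a - mean) * (b - mean) for a, b in zip(x, x[1:]))
    den = sum((a - mean) ** 2 for a in x)
    return num / den


times = [0.01 * i for i in range(20000)]
noise = sim_serial(times, 1.0, 0.0, 0, 0.8, seed=1)
r_est = lag1_correlation(noise)
print(round(r_est, 2))
```

With an exponent of zero the standard deviation is constant, and the estimated correlation between consecutive points of the long synthetic series comes out close to the requested coefficient of 0.8.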

Example. The sample project sim-serial contains the simulation and 
analysis of serially correlated noise. The sampling of the V photometry 
of IC 4996 #89 is used, and the simulator replaces the original observable 
values, according to the line 

sim:replace 

in the file sim-serial.ini. The line 

sim:serial 1 0.8 

specifies noise with a constant standard deviation of 1 and a serial 
correlation coefficient of 0.8. Setting the first two parameters zero provides 
synthetic data for the entire time series. The resulting light curve is displayed 
in Fig. 31. The line 

random number generator: file sim-serial.rnd 

in the screen output indicates that a file sim-serial.rnd is found and used 
to initialise the random number generator. If such a file were not present, 
the system time would be used: 

random number generator: system time initialisation 

A significance spectrum is displayed in Fig. 32. The overall shape of 
the spectrum is typical for serially correlated noise, characterised by higher 
amplitudes and sigs for low frequencies. 

Figure 31: Time series generated by the simulator in the sample projects sim-serial 
(dots) and sim-temporal (open circles), respectively. The sampling represents the 
V photometry of IC 4996 #89. In both samples, the original observable values are 
replaced by the simulator. 

Figure 32: Typical significance spectrum for serially correlated noise, based on the 
sampling of the V photometry of IC 4996 #89. Serial correlation produces 
systematically higher sigs in the low frequency region. 

13.7. Temporally correlated noise 

This simulator module produces Gaussian noise the standard deviation of which 
may vary in time according to a polynomial trend. A temporal correlation 
coefficient R_t between consecutive data points t_{n-1}, t_n may be specified. In 
contrast to the serial correlation, the temporal correlation takes into account the 
width of the time interval between pairs of data points, which has implications 
on the noise behaviour of non-equidistantly sampled data. The serial correlation 
R_s drops exponentially with the distance in time according to 

R_s := R_t^{t_n - t_{n-1}} . (25) 

In this context, the temporal correlation coefficient may be interpreted as the 
serial correlation coefficient of two data points separated by one unit of time. 

The keyword sim:temporal is given with six floating-point parameters. 
They specify 

1. the lower time limit, 

2. the upper time limit, 

3. the coefficient \sigma_0 for the standard deviation of the Gaussian noise, 

4. the time zeropoint t_0 for the polynomial trend of the standard deviation, 

5. the exponent X for the polynomial trend of the standard deviation, and 

6. the temporal correlation coefficient R_t. 

The standard deviation of the Gaussian noise follows the relation 

\sigma(t) := \sigma_0 (t - t_0)^X . (26) 

A full polynomial may be constructed by multiple keywords sim:temporal with 
different parameters. 
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Figure 33: Typical significance spectrum for temporally correlated noise, based on 
the sampling of the V photometry of IC 4996 #89. Temporal correlation produces 
systematically higher sigs in the low frequency region, which is quite comparable to 
serial correlation (Fig. 32). 

If the lower and upper time limits are both set zero, the noise is generated 
for the entire time base. 

Example. The sample project sim-temporal contains the simulation and 
analysis of temporally correlated noise. The sampling of the V photometry 
of IC 4996 #89 is used, and the simulator replaces the original observable 
values, according to the line 

sim:replace 

in the file sim-temporal.ini. The line 

sim:temporal 1 0.01 

specifies noise with a constant standard deviation of 1 and a temporal 
correlation coefficient of 0.01. Setting the first two parameters zero provides 
synthetic data for the entire time series. The resulting light curve is 
displayed in Fig. 31. Comparing this light curve to the dataset generated in the 
project sim-serial (p. 72), the correlation between consecutive data points 
is obviously much stronger in the present example. Using Eq. 25 with a 
typical sampling interval width of 9 min for the dataset under consideration, 
the temporal correlation coefficient of 0.01 corresponds to a serial correlation 
coefficient of approximately 0.97. 

The significance spectrum displayed in Fig. 33 shows the same overall 
characteristics as the corresponding spectrum for serially correlated noise 
(Fig. 32), but the sigs at low frequencies are considerably higher, which is a 
consequence of the strong serial correlation associated to this setup. 
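The conversion used in the example above, a temporal correlation coefficient of 0.01 per day and a sampling interval of 9 min, can be checked numerically. The following Python lines assume that Eq. 25 raises the temporal correlation coefficient to the power of the time difference in the units of the time series (here days); the function name is hypothetical.

```python
def serial_from_temporal(r_t, dt):
    """Serial correlation of two points separated by dt time units (Eq. 25)."""
    return r_t ** dt


dt_days = 9.0 / (24.0 * 60.0)          # 9 minutes expressed in days
r_s = serial_from_temporal(0.01, dt_days)
print(round(r_s, 2))
```

The result rounds to 0.97, in agreement with the value quoted in the example.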
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13.8. Random steps 

This module generates steps following two random processes: 

1. the constant attained by the synthetic observable throughout each step 
follows a Gaussian distribution with an expected value 0, 

2. a Poisson process is used to define when a step has to be incorporated. 

The keyword sim:rndsteps is given with four floating-point parameters. 
They specify 

1. the lower time limit, 

2. the upper time limit, 

3. the standard deviation of the Gaussian distribution defining the constants 
attained throughout each step, 

4. the expected time range for the Poisson distribution of steps. 

If the lower and upper time limits are both set zero, the steps are generated for 
the entire time base. 
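The two random processes may be combined as in the following Python sketch. It assumes that the Poisson process is realised through exponentially distributed waiting times whose mean equals the fourth parameter, and that the time values are sorted in ascending order; neither detail is confirmed by this manual. The function name and the seeds are hypothetical, and the time limits are omitted for brevity.

```python
import random


def sim_rndsteps(times, sigma, mean_gap, seed=None):
    """Piecewise constant series: Gaussian levels (mean 0, std sigma) that
    change at the events of a Poisson process with mean spacing mean_gap."""
    rng = random.Random(seed)
    level = rng.gauss(0.0, sigma)
    next_step = times[0] + rng.expovariate(1.0 / mean_gap)
    values = []
    for t in times:
        while t >= next_step:
            # A step occurred before this data point: draw a new constant.
            level = rng.gauss(0.0, sigma)
            next_step += rng.expovariate(1.0 / mean_gap)
        values.append(level)
    return values


times = [0.01 * i for i in range(1001)]            # ten time units
steps = sim_rndsteps(times, 0.5, 0.07, seed=3)     # many short steps
flat = sim_rndsteps(times, 0.5, 1.0e9, seed=3)     # steps practically never occur
print(len(set(steps)), len(set(flat)))
```

With an expected spacing of 0.07 time units the series consists of many different constants, whereas an extremely large expected spacing leaves a single constant over the whole time base.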

Example. The sample project sim-rndsteps illustrates the simulation 
and analysis of random steps upon the sampling of the V photometry of 
IC 4996 #89. The simulator replaces the original observable values, according 
to the line 

sim:replace 

in the file sim-rndsteps.ini. The line 

sim:rndsteps 0.5 0.07 

in the file sim-rndsteps.ini produces random steps the values of which 
are distributed according to a Gaussian with standard deviation 0.5. The 
expected distance in time of consecutive steps is 0.07 days. The resulting light 
curve is displayed in Fig. 34. 

Since the observables are constant between the steps, one may consider 
each of the corresponding time intervals to contribute a spectral window 
to the DFT, or significance spectrum, correspondingly. The significance 
spectrum associated to the light curve in Fig. 34 is displayed in Fig. 35 and 
represents such a superposition of spectral windows. 
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Figure 34: Time series generated by the simulator in the sample project 
sim-rndsteps. The sampling represents the V photometry of IC 4996 #89. The 
original observable values are replaced by the simulator. 
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Figure 35: Typical significance spectrum for random steps, based on the sampling of 
the V photometry of IC 4996 #89. Each constant in the step function displayed in 
Fig. 34 contributes a spectral window to this DFT. 

13.9. Zero-mean adjustment 

The keyword sim:zeromean may be used to adjust the mean value of the time 
series (or a subset) to zero. It is given with two floating-point parameters, 

1. the lower time limit, and 

2. the upper time limit. 

If the lower and upper time limits are both set zero, the mean value of the 
entire synthetic time series is adjusted to zero. This option was adopted for 
consistency, but does not provide additional functionality, because a zero-mean 
correction of the whole data set is performed at every step of the prewhitening 
cascade by default. 
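The effect of sim:zeromean on a subset of the time series is illustrated by the following Python sketch, which subtracts the mean of all points inside the interval from these points only. Inclusive interval limits are an assumption, and the function name is hypothetical.

```python
def sim_zeromean(times, values, t_lo, t_hi):
    """Shift the mean of the values inside [t_lo, t_hi] to zero;
    both limits zero means the entire time base."""
    whole = (t_lo == 0 and t_hi == 0)
    inside = [i for i, t in enumerate(times) if whole or t_lo <= t <= t_hi]
    if not inside:
        return list(values)  # no data in the interval: nothing to adjust
    mean = sum(values[i] for i in inside) / len(inside)
    adjusted = list(values)
    for i in inside:
        adjusted[i] -= mean
    return adjusted


times = [1.0, 2.0, 3.0, 4.0]
values = [1.0, 3.0, 10.0, 20.0]
subset = sim_zeromean(times, values, 1.0, 2.0)   # only the first two points
entire = sim_zeromean(times, values, 0, 0)       # whole time series
print(subset, entire)
```

Only the two points inside the interval are shifted in the first call, while the second call removes the mean of the whole series.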

Example. In the sample project sim-zeromean, SigSpec models the same 
time series as in the project sim-poly (p. 69), according to the first part of 
the file sim-zeromean.ini: 

This block of sim:poly keywords is followed by a corresponding block of 
sim:zeromean keywords: 

This block is responsible for shifting the mean observable to zero for each 
synthesized power function. 

Fig. 36 compares the corresponding light curve with the light curve 
generated in the project sim-poly (see also Fig. 29). The 16 significant signal 
components detected by SigSpec are of minor interest and not discussed 
here. 

Figure 36: Time series generated by the simulator in the sample project 
sim-zeromean. The sampling represents the V photometry of IC 4996 #89. First 
the simulator generates a set of power functions over intervals within the time series 
(grey), then the actual light curve (black) is produced by shifting the mean observable 
for each power function to zero individually. 

14. Signal-to-Noise Ratio and Lomb-Scargle Periodogram 

As pointed out by Reegen (2007), the SigSpec method represents a tool for 
an iterative frequency analysis of a zero-mean corrected time series superior 
to signal-to-noise ratio estimation (Breger et al. 1993) and the Lomb-Scargle 
periodogram (Lomb 1976; Scargle 1982). However, in some situations these 
alternative methods may be desired or even more reasonable. In particular, the 
Lomb-Scargle periodogram represents the optimum statistical approach to the 
problem if the mean observable is meaningful rather than set to zero arbitrarily. 
The relations between sig and signal-to-noise ratio or Lomb-Scargle periodogram, 
respectively, are introduced and discussed by Reegen (2007). 

In order to meet a user's requirement of signal-to-noise ratio-based DFT 
analysis or Lomb-Scargle periodograms as well, the SigSpec software offers 
two options. First, an analysis relying on amplitude signal-to-noise ratios is 
performed if the keyword DFT is provided in the .ini file. If this keyword is 
specified, all SigSpec computations rely on the approximation of sig by the 
amplitude signal-to-noise ratio according to 



(27) 
where K represents the number of time series data points, A denotes the Fourier
amplitude, and ⟨x²⟩ refers to the variance of the observable.
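
The link between sig and amplitude signal-to-noise ratio can be illustrated numerically. The following minimal sketch assumes the white-noise relation sig ≈ (π log e / 4) (S/N)², which is taken here from the discussion in Reegen (2007) and is not spelled out in this section. The function names are illustrative and not part of SigSpec.

```python
import math

LOG10_E = math.log10(math.e)  # "log e" with the decadic logarithm


def sig_from_snr(snr: float) -> float:
    """Approximate sig for a given amplitude signal-to-noise ratio.

    Assumption: white noise, sig ~ (pi * log e / 4) * snr**2.
    """
    return math.pi * LOG10_E / 4.0 * snr ** 2


def snr_from_sig(sig: float) -> float:
    """Inverse of sig_from_snr under the same assumption."""
    return math.sqrt(4.0 * sig / (math.pi * LOG10_E))


if __name__ == "__main__":
    # The popular criterion S/N = 4 translates into a sig of roughly 5.46.
    print(round(sig_from_snr(4.0), 2))
    # The default SigSpec threshold sig = 5 expressed as a signal-to-noise ratio.
    print(round(snr_from_sig(5.0), 2))
```

Under this assumption, the default sig limit of 5 is slightly less restrictive than an amplitude signal-to-noise ratio of 4.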

Second, the keyword Lomb forces SigSpec to evaluate Lomb-Scargle periodograms
rather than significance spectra. In this case, the sig is approximated
by

$$\mathrm{sig}(A) \approx \frac{K \log e \, P_{LS}}{\langle x^2 \rangle}, \tag{28}$$

where P_LS denotes the power level in terms of the Lomb-Scargle periodogram.

Example. In the sample projects DFT and L-S, the input time series represents
the V photometry of IC 4996 #89.
The file DFT.ini contains a single entry

DFT 



which forces SigSpec to rely on the signal-to-noise ratio of DFT amplitudes. 
The screen output is: 

1 freq 3.13205 sig 9.75026 rms 0.00449592 csig 9.75026 

2 freq 3.98473 sig 6.80132 rms 0.00422861 csig 6.80083 

3 freq 5.40684 sig 5.31609 rms 0.0040257 csig 5.30209 

4 freq 17.3677 sig 4.1816 rms 0.00388775 csig 4.14988 




Figure 37: Significance spectrum of the V photometry of IC 4996 #89 (grey) and
approximation by the signal-to-noise ratio of DFT amplitudes (black).

The file L-S.ini contains a single keyword

Lomb 

and SigSpec uses the Lomb-Scargle periodogram rather than sig for all
computations. The screen output is:

1 freq 3.13205 sig 9.75026 rms 0.00449592 csig 9.75026

2 freq 3.98472 sig 6.79398 rms 0.00422861 csig 6.7935

3 freq 5.40684 sig 5.31451 rms 0.0040257 csig 5.30033

4 freq 17.3677 sig 4.18161 rms 0.00388775 csig 4.14977

Figure 38: Significance spectrum of the V photometry of IC 4996 #89 (grey) and
approximation by the Lomb-Scargle periodogram (black).

The significance spectrum of the input time series is compared to the 
approximations by DFT amplitude signal-to-noise ratio and Lomb-Scargle 
periodogram in Figs. 37 and 38, respectively. 

A comparison of the two outputs and the screen output of the corresponding
sig-based application (Example SigSpecNative, p. 10) reveals slightly
different signal components. Especially for the second component, the
frequency of which is close to an integer multiple of 1 cycle per day and
therefore susceptible to aliasing, the results differ among all three methods.
However, the frequencies, amplitudes and phases in the files result.dat are
in good agreement, and their differences reflect only the numerical
uncertainties of the MultiSine fitting procedure.



15. Frequently Asked Questions 

This section contains questions frequently asked by users familiar with common
methods of astronomical time series analysis involving signal-to-noise ratio
estimation in power spectra and consecutive prewhitenings. The intention is to
clarify the differences between these classical techniques and SigSpec from the
user's perspective.



15.1. Changing sig in a prewhitening sequence 

Given a time series showing two different peaks in the power spectrum,
prewhitening of the dominant signal usually does not cause
a major change in the height of the secondary peak in the spectrum
of the residuals. Why does the corresponding sig change?




Figure 39: Grey graphs: sig (left) and power (squared amplitude) spectra (right) of a
synthetic time series containing two signals plus noise. The sampling represents the
V photometry of IC 4996 #89. Black graphs: spectra after subtracting the dominant
signal (f = 4.68573 d⁻¹). Horizontal axes: f [d⁻¹].

The situation is illustrated in Fig. 39, which displays the sig (left panel) and power
(right panel) spectra (in this sample just squared amplitude) generated by a
synthetic time series. It consists of two sinusoidal signals, f₁ = 4.68573 d⁻¹,
A₁ = 7.27, and f₂ = 5.26934 d⁻¹, A₂ = 3.31, plus noise with unit rms
error. The plots contain a comparison of the initial spectra (grey) and the
spectra after subtraction of the first signal component (black). In the right panel,
the power associated with the peak at 5.27 d⁻¹ differs only slightly between the two
iterations, whereas the corresponding sig in the left panel increases dramatically
in the second iteration.

The reason for this behaviour is that the sig refers to the probability of a
random time series with the same rms error as the given one to produce a peak
like the given one. In the first iteration, the sig calculation is based on the
initial time series (rms error 5.84), and in the second iteration, it relies on the
residual time series after prewhitening of the peak at f₁, the rms error of which
is 2.46. The ratio of rms errors (≈ 2.4) is in agreement with the square root of
the ratio of sigs at f₂ in the two iterations (≈ 2.2).
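
The arithmetic behind this comparison can be checked in a few lines. The sketch assumes that, for a fixed amplitude and a fixed number of data points, sig scales inversely with the variance of the time series, so that the ratio of sigs between the two iterations is expected to be close to the squared ratio of rms errors. The variable names are illustrative only.

```python
# rms errors quoted in the text for the two iterations
rms_initial = 5.84   # initial time series (both signals plus noise)
rms_residual = 2.46  # residuals after prewhitening the peak at f1

# Assumption: sig is proportional to 1 / variance at fixed amplitude and K,
# so the sig at f2 should grow by about the squared rms ratio.
rms_ratio = rms_initial / rms_residual
expected_sig_ratio = rms_ratio ** 2

if __name__ == "__main__":
    print(f"rms ratio:          {rms_ratio:.2f}")
    print(f"expected sig ratio: {expected_sig_ratio:.2f}")
```

The rms ratio evaluates to about 2.4, to be compared with the value of about 2.2 read from the two sig spectra.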

This effect is more prominent for high sigs, because in this case prewhitening 
causes a major change in the statistical properties of the time series. If a 
peak with a low sig is prewhitened, the time series is affected marginally, and 
correspondingly, the sigs of other signals do not change much. 



15.2. The effect of binning 



Consider a time series representing a signal plus noise. If the data
points are grouped into bins, the noise of the binned observables
is reduced by the square root of the number of points in each
bin. On the other hand, the number of data points the time
series consists of is reduced by a factor equal to the number of
points in each bin. Since these two effects cancel each other, the
noise level in the power spectrum will be the same for unbinned and
binned data. What is the corresponding situation in terms of significance?




Figure 40: Grey graphs: sig (left) and power (squared amplitude) spectra (right) of a
synthetic time series (100 equidistant data points) containing a sinusoidal signal plus
noise with a standard deviation of 1. The signal amplitude is 0.5. Black graphs: same
for time series data grouped into bins of two points. The resulting time series consists
of 50 data points.



Fig. 40 contains the significance (left) and power (squared amplitude) spectra
(right) of a synthetic time series containing a sinusoidal signal with a
frequency of 0.075832 d⁻¹ plus Gaussian noise with a standard deviation of 1. The
signal amplitude is 0.5, providing an amplitude signal-to-noise ratio of 5.64. All
corresponding plots are displayed in grey colour. The black graphs represent
the spectra generated by a binned version of the time series: each bin contains
two data points, and the observable is the arithmetic mean.

In terms of sig as well as amplitude, binning affects neither the peak nor the
mean amplitude appreciably: the reduced number of data points would increase
the amplitude noise, but this effect is compensated by the fact that binning reduces
the rms residual in the time domain. For a multi-sine signal plus white noise, the
number of significant peaks in a given frequency range will hardly be modified
by data binning. A considerable change of these sigs by binning is an indication
that the noise is not white. A correlation between consecutive measurements
in the time series would be a reasonable explanation for such a behaviour.
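
The cancellation of the two effects can be verified with a short calculation. The sketch below assumes that the mean DFT amplitude of white noise with standard deviation σ sampled at K points is sqrt(π σ² / (4K)); this convention is an assumption adopted here because it reproduces the amplitude signal-to-noise ratio of 5.64 quoted above. The function names are hypothetical.

```python
import math


def mean_noise_amplitude(sigma: float, n_points: int) -> float:
    """Mean DFT amplitude of white noise (assumed convention, see lead-in)."""
    return math.sqrt(math.pi * sigma ** 2 / (4.0 * n_points))


def bin_noise(sigma: float, n_points: int, bin_size: int):
    """Noise standard deviation and number of points after averaging in bins."""
    return sigma / math.sqrt(bin_size), n_points // bin_size


noise_unbinned = mean_noise_amplitude(1.0, 100)
noise_binned = mean_noise_amplitude(*bin_noise(1.0, 100, 2))
snr_unbinned = 0.5 / noise_unbinned  # signal amplitude 0.5 as in Fig. 40

if __name__ == "__main__":
    print(f"noise level, 100 points: {noise_unbinned:.4f}")
    print(f"noise level,  50 points: {noise_binned:.4f}")
    print(f"amplitude S/N:           {snr_unbinned:.2f}")
```

Both noise levels agree, because the factor gained in the variance is lost again in the number of data points.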

15.3. Binning of extremely strong signals

If an extremely strong signal is binned, the sig changes, whereas
the signal amplitude and the noise level do not. Why?




Figure 41: Grey graphs: sig (left) and power (squared amplitude) spectra (right) of a
synthetic time series (100 equidistant data points) containing a sinusoidal signal plus
noise with a standard deviation of 1. The signal amplitude is 10. Black graphs: same
for time series data grouped into bins of two points. The resulting time series consists
of 50 data points.

Fig. 41 contains the significance (left) and power (squared amplitude) spectra
(right) of a synthetic time series containing a sinusoidal signal with a
frequency of 0.075832 d⁻¹ plus Gaussian noise with a standard deviation of 1.
The signal amplitude of 10 is associated with an amplitude signal-to-noise ratio
of more than 100. All corresponding plots are displayed in grey colour. The
black graphs represent the spectra generated by a binned version of the time
series: each bin contains two data points, and the observable is the arithmetic
mean.

For both strong and weak signals, binning affects neither the peak nor the
mean amplitude appreciably: the reduced number of data points would increase
the amplitude noise, but this effect is compensated by the fact that binning reduces
the rms residual in the time domain.

In terms of sig the situation is different: for very strong signals, the peak
sig is reduced by binning. Classical techniques prewhiten a peak under
consideration and employ the residuals to estimate a noise level. SigSpec does not
involve any prewhitening at this stage. In the case of a dominant signal plus a tiny
scatter, the unbinned and binned data have comparable rms deviations, which are
mainly determined by the signal. In the frequency domain, only the reduced number
of binned data points comes into play.

Very strong signals let the sig drop to ≈ 1/N of the unbinned value by forming
groups of N data points: in Fig. 41, left panel, the grey peak is about twice as
high as the black peak.
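
The different behaviour of weak and strong signals can be made quantitative with a simple model. The sketch assumes that sig is proportional to K / ⟨x²⟩ at fixed amplitude, that the variance of the time series is A²/2 + σ² for a sinusoid of amplitude A plus white noise of standard deviation σ, and that averaging within bins leaves the slowly varying sinusoid practically unchanged. These are simplifying assumptions, and the function name is hypothetical.

```python
def sig_ratio_after_binning(amplitude: float, sigma: float, bin_size: int) -> float:
    """Ratio sig(binned) / sig(unbinned) under the assumptions of the lead-in."""
    var_unbinned = amplitude ** 2 / 2.0 + sigma ** 2
    # Binning averages down the noise only; the signal variance is kept.
    var_binned = amplitude ** 2 / 2.0 + sigma ** 2 / bin_size
    # The number of data points shrinks by the bin size.
    return var_unbinned / (bin_size * var_binned)


weak = sig_ratio_after_binning(amplitude=0.5, sigma=1.0, bin_size=2)
strong = sig_ratio_after_binning(amplitude=10.0, sigma=1.0, bin_size=2)

if __name__ == "__main__":
    print(f"weak signal   (A = 0.5): {weak:.3f}")
    print(f"strong signal (A = 10):  {strong:.3f}")
```

For the weak signal of Fig. 40 the model leaves the sig nearly unchanged, whereas for the strong signal of Fig. 41 it approaches one half, in line with the grey peak being about twice as high as the black one.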



15.4. Linear interpolation: more information? 

Consider a time series representing a signal plus noise. Generating
additional data points through linear interpolation increases
the sig of the signal peak, although the power spectrum remains
practically unchanged. This provides the possibility to boost signal
sigs artificially, although the amount of information contained
by the time series does not increase. Does this make sense?




Figure 42: Grey graphs: sig (left) and power (squared amplitude, logarithmic scale)
spectrum (right) of a synthetic time series (100 equidistant data points) containing a
sinusoidal signal without noise. Black graphs: same for a new time series generated by
inserting 9 additional linearly interpolated points between each pair of consecutive
points such that the result is an equidistantly sampled dataset consisting of 991 points.



Fig. 42 displays the sig (left) and power (squared amplitude, right) spectrum
of an equidistantly sampled time series consisting of 100 data points and
representing a sinusoidal signal with a frequency of 0.075832 d⁻¹ and an amplitude
of 1 in black colour. No noise is added. Based on this time series, a new dataset
is generated: between each pair of data points, 9 additional, equidistant data
points are inserted. The observables are assigned by linear interpolation. The
number of data points in this new time series is thus 991. The corresponding
spectra are shown in grey colour. The longer dataset generates a peak significance
that is roughly ten times higher than the initial one, whereas the power
spectrum remains practically unchanged. The small deviation of the black graph
from the grey one is due only to the fact that linear interpolation does not
reproduce exactly the "true" observables that the signal would generate.
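
The construction of the interpolated dataset can be sketched as follows. The helper name is hypothetical; the example only reproduces the counting argument (100 points become 991) and the resulting increase in the number of data points, which under the assumption that sig is proportional to K at fixed amplitude and variance accounts for the roughly tenfold increase in peak significance.

```python
import math


def insert_linear_points(times, values, n_extra):
    """Insert n_extra linearly interpolated points between consecutive samples."""
    dense_t, dense_v = [], []
    for i in range(len(times) - 1):
        for j in range(n_extra + 1):
            w = j / (n_extra + 1)  # w = 0 reproduces the original sample
            dense_t.append(times[i] + w * (times[i + 1] - times[i]))
            dense_v.append(values[i] + w * (values[i + 1] - values[i]))
    dense_t.append(times[-1])
    dense_v.append(values[-1])
    return dense_t, dense_v


# 100 equidistant points of a noise-free sinusoid, frequency 0.075832 per time unit
times = [float(k) for k in range(100)]
values = [math.sin(2.0 * math.pi * 0.075832 * t) for t in times]
dense_t, dense_v = insert_linear_points(times, values, 9)

if __name__ == "__main__":
    print(len(times), len(dense_t))
    print(f"increase in number of points: {len(dense_t) / len(times):.2f}")
```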

The explanation for this behaviour is quite similar to the previous section
"The effect of binning", p. 82, and correspondingly, the effect is mitigated for
very noisy signals. Therefore in practical applications, it will be impossible to
enhance the capability of a frequency analysis by artificially introducing new
data points.

15.5. Which sig threshold is reasonable? 

Occasionally, sigs or sig limits are shifted by log(K/2), K denoting
the number of time series data points. Which sig threshold is the
true one?

In fact both versions are correct, but they apply to different questions.
The version without log(K/2) refers to the probability that an amplitude level (a
peak) at a given frequency and phase occurs by chance. The version including
log(K/2) corresponds to the probability that the highest out of K/2 independent
peaks occurs by chance. According to the sampling theorem, the DFT of K
data points (a system with K degrees of freedom) produces ≈ K/2 independent
frequencies in Fourier space, if the sampling is equidistant. Although there is
no explicit prescription where to find a set of independent frequencies for
non-equidistant sampling, the system will still have K degrees of freedom, and the
statistical considerations will be essentially the same.

A simple experiment makes the situation clearer: we roll a die and obtain
the result "4". The probability that such an experiment returns at least "4"
(i.e. "4", "5" or "6") is, of course, 50%. This refers to the examination of
an individual peak without respect to all the others in the spectrum. If we roll
10 dice, the probability for at least one showing "4" or more is dramatically
higher, namely > 99.9%. This refers to examining the highest out of 10 peaks.
The increasing probability of obtaining such a result by chance corresponds to
a decreasing significance of the result.
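
The dice experiment, and the resulting shift of the significance by the logarithm of the number of independent trials, can be reproduced with a few lines. The sketch treats the trials as independent and defines the significance as the negative decadic logarithm of the chance probability; the function names are illustrative.

```python
import math


def prob_at_least_one(p_single: float, n_trials: int) -> float:
    """Probability that at least one of n independent trials succeeds."""
    return 1.0 - (1.0 - p_single) ** n_trials


def significance(p: float) -> float:
    """Negative decadic logarithm of a chance probability."""
    return -math.log10(p)


p_one_die = prob_at_least_one(0.5, 1)    # one die showing at least "4"
p_ten_dice = prob_at_least_one(0.5, 10)  # at least one out of ten dice

# For small probabilities the significance drops by about log10(n_trials).
sig_single = significance(1e-6)
sig_highest_of_50 = significance(prob_at_least_one(1e-6, 50))

if __name__ == "__main__":
    print(f"one die:  {p_one_die:.1%}")
    print(f"ten dice: {p_ten_dice:.2%}")
    print(f"sig shift for 50 trials: {sig_single - sig_highest_of_50:.3f}")
    print(f"log10(50):               {math.log10(50):.3f}")
```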

16. Keywords Reference 

This section is a compilation of all keywords accepted by SigSpec. A brief 
description of arguments and default values is given. The type of argument is 
provided by either <int> or <double>, and default values are given in paren- 
theses, e.g. (2). Empty parentheses indicate that there is no default setting. 
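
As an illustration of how keyword lines with typed arguments can be read, the following sketch parses a small subset of the keywords listed below. The file layout (one keyword per line, followed by whitespace-separated arguments) is inferred from examples such as DFT.ini and is an assumption; the table and the function are hypothetical and do not reproduce the parser implemented in SigSpec.

```python
# Hypothetical subset of keywords with their argument types, taken from this section.
KEYWORD_ARGS = {
    "siglimit": (float,),
    "iterations": (int,),
    "osratio": (float,),
    "col:time": (int,),
    "col:obs": (int,),
    "DFT": (),
    "Lomb": (),
}


def parse_ini(text: str) -> dict:
    """Parse keyword lines into a dictionary of typed argument tuples."""
    settings = {}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip empty lines
        keyword, args = tokens[0], tokens[1:]
        if keyword not in KEYWORD_ARGS:
            raise ValueError(f"unknown keyword: {keyword}")
        types = KEYWORD_ARGS[keyword]
        if len(args) != len(types):
            raise ValueError(f"{keyword} expects {len(types)} argument(s)")
        settings[keyword] = tuple(cast(arg) for cast, arg in zip(types, args))
    return settings


if __name__ == "__main__":
    example = "siglimit 5\ncol:time 1\ncol:obs 2\nDFT\n"
    print(parse_ini(example))
```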

antialc:adopt <int> (1)

number of AntiAIC test iterations adopted for the main prewhitening cascade, 
p. 49 

antialc:depth <int> (automatic)

AntiAIC computation depth, p. 49 

parameter: number of iterations used for peak combination testing 
default: —i= , where Pai is the AntiAIC parameter, rounded to the successive 

integer value 

antialc:par <double> 

AntiAIC parameter p_AI: sig limit relative to maximum for the selection of
candidate peaks (0 ... use the sig limit siglimit instead), p. 49

antialc:siglimit <double> ()

significance limit for the AntiAIC candidate peak selection (no significance limit 
by default; the limit defined by the keyword siglimit is used instead), p. 49 

col:obs <int> (2) 

observable column index (unique), starting with 1, p. 13 

col:ssid <int> () 

subset identifier column index (also multiple), starting with 1, p. 16

col:time <int> (1)

time column index (unique), starting with 1, p. 13

col:weights <int> ()

weights column index (also multiple), starting with 1, p. 14

comp <int> ()

specifies the file indicated by the parameter as comparison dataset, p. 60

correlograms <int> <int> <int> () ()

specifications for correlogram files c#iteration#.dat, p. 43
parameters: 

• correlogram order (maximum index lag), default: half of the number of 
time series data points. 

• number of files to generate (< for all correlogram files, default: no
correlogram computation),

• step width (number of iterations) for output. 

csiglimit <double> () 

lower cumulative sig limit, p. 25 

deftype <target/comp/skip> (target) 

specifies the type of dataset to be assigned to a time series by default, p. 60

DFT

forces SigSpec to approximate all sigs by signal-to-noise ratios of DFT ampli- 
tudes, p. 79 

diff:comp

specifies the DFT amplitude spectrum of the comparison datasets to be calcu- 
lated through a weighted mean of Fourier vectors, p. 61 

diff:compalign

specifies the DFT amplitude spectrum of the comparison datasets to be calcu- 
lated as a weighted mean of Fourier amplitudes, p. 62 

This setting forces SigSpec to take into account also correlated signal 
components that lag in phase with respect to each other and the target dataset, 
respectively. 

diff:off

switches off the differential significance computation (default), p. 62

freqspacing <double> ()

spacing between consecutive frequencies [inverse time units], p. 21 

harmonics <int> () 

activates the simultaneous analysis of a fundamental plus harmonics (the fre- 
quencies of which are integer multiples of the fundamental) the number of 
which is specified by the parameter, p. 53 

iterations <int>

number of prewhitening iterations, p. 24 

lfreq <double> (0)

lower frequency limit [inverse time units], p. 19

Lomb

forces SigSpec to approximate all sigs by the Lomb-Scargle periodogram, p. 80

mfstart <int> (0)

index of the first time series input file to apply the MultiFile mode to, p. 58

mstracks <int> <int> ()

MultiSine tracks are written to files m#index#.dat, where #index# refers to
the signal component in the result files. The parameters are

1. the maximum number of iterations for which to write entries to the Mul- 
tiSine track files, and 

2. the step width (number of iterations) for output, p. 38. 

The file name may be assigned additional indices for Time-Resolved analysis 
(p. 44) and/or MultiFile mode (p. 57). 

multifile <int> () 

activates MultiFile mode, p. 58 

parameter: maximum index of time series input files (< ... infinite) 

multisine:lock

forces SigSpec to use the "raw" frequencies, amplitudes, and phases (without
MultiSine fitting) for the subsequent analysis, p. 24.

multisine:newton <double> <double> <double> (0.000001 1 0.000001)

accuracy parameters for the MultiSine least-squares fits

1. scaling factor for the overall precision of resulting frequencies,

2. degree of dependence of the frequency accuracy on the peak sig,

3. the minimum relative improvement of rms residual between consecutive
iterations to continue the fitting process, p. 23.

multisine:unlock

forces SigSpec to use the frequencies, amplitudes, and phases improved by
MultiSine least-squares fits for the subsequent analysis (default), p. 24.

nycoef <double> (0.5)

Nyquist Coefficient (between 0 and 1), p. 20

nyscan 

Nyquist Coefficients for the specified frequency range (file nycoef.dat or
<#multifile#>.nycoef.dat), p. 21

osratio <double> (20)

oversampling ratio, p. 22

phdist:cart

generates a Phase Distribution Diagram in three-dimensional cartesian
coordinates, p. 37

phdist:colmodel:lin

specifies the linear colour model, i.e., phase probability density is used as a
colour scale, p. 37

phdist:colmodel:rank

specifies the rank colour model, i.e., the rank in an ascending sequence of sock
significances is used as a colour scale, p. 37

phdist:colour <double> <double> <double> <double>

A set of phdist:colour lines defines an RGB path for colourising the Phase
Distribution Diagram, p. 37.
parameters:
parameters: 

• red channel (0...255) 

• green channel (0...255) 




• blue channel (0...255) 

• scale 

The scale parameter refers directly to probability density of phases in case
of phdist:colmodel:lin, or to a fractile of probability density on the interval
[0, 1] in case of phdist:colmodel:rank.

phdist:cyl

generates a Phase Distribution Diagram in three-dimensional cylindrical
coordinates (default setting), p. 36. The frequency is the height axis, the phase is
the azimuth angle, and the radial coordinate refers to the probability density of
phase.

phdist:fill <double> (0)

specifies a filling factor to compute extra frequencies if the difference of phase
PDFs between two adjacent frequencies is too high, p. 36.

parameter: number of additional frequencies per unit probability density
(difference between two adjacent frequencies)

phdist:phases <int> ()

generates a Phase Distribution Diagram for the sampling of the given time
series, p. 36. By default, no Phase Distribution Diagram is computed.

parameter: number of phase angles in the interval [0, π[, if the maximum
probability density is < 1. Between 1 and 2, twice this number is used, and
so on. This enhances the visibility of the Phase Distribution Diagram also in
frequency and phase regions associated with a very eccentric phase distribution.

preview <double> 

generates a preview, p. 41. Instead of a prewhitening cascade, only one
significance spectrum is computed. All local maxima above the specified significance
limit are written to a file preview.dat. By default, no preview is computed.

parameter: significance limit

profile 

SigSpec generates a file profile.dat containing the sampling profile for the 
given time series, p. 32. By default, the file profile.dat is not generated. 

This keyword is ignored in MultiFile mode, where sampling profiles are
calculated and written to files whenever required by the program. See "MultiFile
Mode", p. 57 for further information.

residuals <int> <int> () 

output files containing residual time series (only residuals.dat for the
residuals after prewhitening all significant components by default). The parameters
are

1. the maximum number of iterations (files t#iteration#.dat), and 

2. the step width (number of iterations) for output, p. 29. 

The file name may be assigned additional indices for Time-Resolved analysis 
(p. 44) and/or MultiFile mode (p. 57). 

results <int> <int> () 

output files containing a list of significant signal components. The default 
setting is to produce only a file result.dat for the final list. The parameters 
are 

1. the maximum number of iterations for which to write additional result
files r#iteration#.dat, and

2. the step width (number of iterations) for output, p. 30. 

The file name may be assigned additional indices for Time-Resolved analysis 
(p. 44) and/or MultiFile mode (p. 57). 

siglimit <double> (5)

lower sig limit (0 to deactivate), p. 24 

sim:add

add synthetic data to given observable, p. 65 

sim:exp <double> <double> <double> <double> <double> () 

exponential trend, p. 70 
parameters: 

• lower time limit [time units] 

• upper time limit [time units]

• scale 

• time zeropoint [time units] 

• exponent 

sim:off

deactivate simulator (default), p. 65 

sim:poly <double> <double> <double> <double> <double> ()

polynomial trend, p. 68 
parameters: 

• lower time limit [time units] 

• upper time limit [time units] 

• scale 

• time zeropoint [time units] 

• exponent 

full polynomial by multiple declaration with different scales, time zeropoints, 
and exponents 

sim:replace

replace given observable by synthetic data, p. 65 

sim:rndsteps <double> <double> <double> <double> () 

random steps, p. 76 
parameters: 

• lower time limit [time units] 

• upper time limit [time units] 

• standard deviation for Gaussian distribution of (constant) step values 

• expected time range for Poisson distribution of steps [time units] 

sim:serial <double> <double> <double> <double> <double> <double>

serially correlated noise, p. 72 
parameters: 

• lower time limit [time units] 

• upper time limit [time units] 

• scale for standard deviation 

• time zeropoint for polynomial trend of standard deviation [time units] 

• exponent for polynomial trend of standard deviation 

• serial correlation coefficient 

full polynomial by multiple declaration with different scales, time zeropoints, 
and exponents 

sim:signal <double> <double> <double> <double> <double> ()

sinusoidal signal, p. 66 
parameters: 

• lower time limit [time units] 

• upper time limit [time units] 

• amplitude 

• time zeropoint [time units] 

• frequency [inverse time units] 

sim:temporal <double> <double> <double> <double> <double> <double>

temporally correlated noise, p. 74 
parameters: 

• lower time limit [time units] 

• upper time limit [time units] 

• scale for standard deviation 

• time zeropoint for polynomial trend of standard deviation [time units] 

• exponent for polynomial trend of standard deviation 

• temporal correlation coefficient

full polynomial by multiple declaration with different scales, time zeropoints, 
and exponents 

sim:zeromean <double> <double> ()

zero-mean adjustment, p. 77 
parameters: 

• lower time limit [time units] 

• upper time limit [time units] 

skip <int> () 

forces SigSpec to skip the file indicated by the parameter, p. 60

sock:cart

generates a Sock Diagram in three-dimensional cartesian coordinates, p. 34

sock:colmodel:lin

specifies the linear colour model, i.e., sock significance is used as a colour scale,
p. 34

sock:colmodel:rank

specifies the rank colour model, i.e., the rank in an ascending sequence of sock
significances is used as a colour scale, p. 34

sock:colour <double> <double> <double> <double>

A set of sock:colour lines defines an RGB path for colourising the Sock
Diagram, p. 34.
parameters: 

• red channel (0...255) 

• green channel (0...255) 

• blue channel (0...255) 

• scale 

For the linear colour model selected by the keyword sock:colmodel:lin,
the scale parameter refers directly to sock significance. If the rank colour model
is selected (sock:colmodel:rank), it refers to a fractile of sock significance
on the interval [0, 1].

sock:cyl

generates a Sock Diagram in three-dimensional cylindrical coordinates (default 
setting), p. 34. The frequency is the height axis, the phase is the azimuth angle, 
and the radial coordinate refers to the sock significance. 

sock:fill <double> (0)

specifies a filling factor to compute extra frequencies if the sock significance 
difference between two adjacent frequencies is too high, p. 33. 

parameter: number of additional frequencies per unit sig (difference between 
two adjacent frequencies) 

sock:phases <int>

generates a Sock Diagram for the sampling of the given time series, p. 32. By
default, no Sock Diagram is computed.

parameter: number of phase angles in the interval [0, π[, if the maximum
sock significance is < 1. Between 1 and 2, twice this number is used, and so on.
This enhances the visibility of the Sock Diagram also in frequency and phase
regions associated with a high sock significance.

spectra <int> <int> 

output files containing spectra (only s000000.dat for the spectrum of the
initial time series and resspec.dat for the spectrum of the residuals after
prewhitening all significant components by default). The parameters are

1. the maximum number of iterations (files s#iteration#.dat), and

2. the step width (number of iterations) for output, p. 28. 

The file name may be assigned additional indices for Time-Resolved analysis 
(p. 44) and/or MultiFile mode (p. 57). 

target <int> 

specifies the file indicated by the parameter as target dataset, p. 60 

timeres:range <double> ()

subset interval width [time units], p. 44 

timeres:step <double> ()

step width between subset centres [time units], p. 44 

timeres:w:cos <double> <double>

cosine weights, p. 45 
parameters: 

• frequency [inverse time units] 

• phase [rad] 

timeres:w:cosp <double> <double> <double>

weights according to the power of a cosine, p. 45 

parameters: 

• frequency [inverse time units] 

• phase [rad] 

• exponent 

timeres:w:damp <double> ()

exponential damping, p. 45 

parameter: width [time units] 

timeres:w:exp <double> ()

exponential weights, p. 45 

parameter: width [time units] 

timeres:w:gauss <double>

Gaussian weights, p. 45 

parameter: standard deviation [time units] 

timeres:w:ipow <double> ()

inverse power weights, p. 45

parameter: exponent

timeres:w:none

unweighted moving averages, i.e. a rectangular filter, p. 45 

ufreq <double> () 

upper frequency limit [inverse time units], p. 20

win

SigSpec generates a file win.dat containing the spectral window for the given
time series. By default, the file win.dat is not generated, p. 31.

17. Online availability 

The ANSI-C code is available online at http://www.sigspec.org. For further
information, please contact P. Reegen, peter.reegen@univie.ac.at.